Abstract — We consider the problem of communicating over 
the general discrete memoryless broadcast channel (BC) with 
partially cooperating receivers. In our setup, receivers are able 
to exchange messages over noiseless conference links of finite ca- 
pacities, prior to decoding the messages sent from the transmitter. 
In this paper we formulate the general problem of broadcast with 
cooperation. We first find the capacity region for the case where 
the BC is physically degraded. Then, we give achievability results 
for the general broadcast channel, for both the two independent 
messages case and the single common message case. 

Index Terms — Broadcast channels, cooperative broadcast, re- 
lay channels, channel capacity, network information theory. 



I. INTRODUCTION 



A. Motivation 



In the classic broadcast scenario the receivers decode their 
messages independently of each other. However, the increasing 
interest in networking motivates the consideration of broadcast 
scenarios in which each node in the network, besides decoding 
its own information, tries to help other nodes in decoding. 
This problem comes up naturally in sensor networks, where a 
transmitter external to the sensor network wants to download 
data into the network, e.g., to configure the sensor array. 
The concept of cooperation among receivers is also relevant 
to general ad-hoc networks, since such cooperation provides 
a method for increasing the rates without increasing the 
spectrum allocation. Therefore, this motivates the study of the 
effect of receiver cooperation on the rates for the broadcast 
channel. 

B. The Discrete Memoryless Broadcast Channel (DMBC) 

The broadcast channel was introduced by Cover in [1]. 
Following this initial work, Bergmans proved an achievability 
result for the degraded BC, [2], and also a partial converse 
that holds only for the Gaussian broadcast channel [3]; in [4] 
Gallager established a converse that holds for any discrete 
memoryless degraded broadcast channel. In [5] El-Gamal 
generalized the capacity result for the degraded broadcast 
channel to the "more capable" case, and in [6] and [7] he 
showed that feedback does not increase the capacity region 
for the physically degraded case. Several other classes of 
broadcast channels were studied in the following years. For 
example, the sum and product of two degraded broadcast 
channels were considered in [8], and in [9], [10] and [11] 
the deterministic broadcast channel was analyzed. 

For the general broadcast channel, Cover derived an achiev- 
able rate region for the case of two independent senders 
in [12]. In [13] Körner and Marton considered the capacity 
of general broadcast channels with degraded message sets. 
The best achievable region and the best upper bound for 
the two independent senders case were derived by Marton 
in [14], and a simple proof of Marton's achievable region 
appeared later in [15]. Another upper bound for the general 
broadcast channel, the so-called degraded, same-marginals 
(DSM) bound, was presented in [16]. This bound is weaker 
than the upper bound in [14] but stronger than Sato's upper 
bound previously presented in [17]. We note, however, that 
while Marton's upper bound is the strongest, it is valid only 
for the two-receiver case, while Sato's bound and the DSM 
bound can be extended to more than two receivers. The effect 
of feedback on the capacity of the Gaussian broadcast channel 
was studied in [18] and [19], and in [20] the case of correlated 
sources was considered. A survey on the topic, with extensive 
references to previous work, can be found in [21]. In recent 
years the Multiple-Input-Multiple-Output (MIMO) Gaussian 
broadcast channel has attracted a lot of attention. Initially, the 
sum-rate capacity was characterized in [22], [23], [24], [25], 
and finally, in [26] the capacity region was obtained. 

None of the early work on the DMBC considered direct 
cooperation between the receivers. In the cooperative broadcast 
scenario, a single transmitter sends two messages to two 
receivers encoded in a single channel codeword $X^n$, where 
the superscript $n$ denotes the length of a vector. Each of 
the receivers gets a noisy version of the codeword, $Y_1^n$ at 
$R_{x1}$ and $Y_2^n$ at $R_{x2}$. After reception, the receivers exchange 
messages over noiseless conference links of finite capacities 
$C_{12}$ and $C_{21}$, as depicted in Figure 1. The conference messages 
are, in general, functions of $Y_1^n$ (at $R_{x1}$), $Y_2^n$ (at $R_{x2}$), and 
the previous messages received from the other decoder. After 
conferencing, each receiver decodes its own message. 

Fig. 1. Broadcast channel with two private messages and cooperating 
receivers. 

We note that in a recent work, [27], the authors consider 
the problem of interactive decoding of a single broadcast 
message over the independent broadcast channel by a group 
of cooperating users. In our work we extend this scenario to 
the general channel and also consider the two independent 
senders case. 

C. Cooperative Broadcast: A Combination of Broadcasting 
and Relaying 

The scenario in which one transceiver helps a second 
transceiver in decoding a message is clearly a relay scenario. 
Hence, cooperative broadcast can be viewed as a general- 
ization of the broadcast and relay scenarios into a hybrid 
broadcast/relay system, which better describes future commu- 
nication networks. 

Scenarios of this type have attracted considerable attention 
recently both from the practical and the theoretical aspects. 
From the practical aspect, new protocols are proposed for 
the collaborative broadcast scenario. For example in [28] the 
authors present a protocol for collaborative decision making 
involving broadcasting and relaying. From the theoretical 
aspect, there is a considerable effort invested in characterizing 
the capacity of an entire network. This work started with 
[29] and recent results appear in [30] and the following 
work [31], [32] and [33]. This work focuses on the Gaussian 
case. A complementing approach for studying the performance 
of a network is to combine the basic building blocks of a 
network, namely multiple access, relaying and broadcasting 
and study the capacity of these combinations. The recent 
work on relaying focuses on extending the single relay results 
derived in [34] to the MIMO case (see for example [35]) and 
to the multiple level case [36], [37]. Another recent result 
was introduced in [38] where joint decoding was applied to 
the combined decode-and-forward and estimate-and-forward 
scheme of [34, theorem 7]. A third approach for studying 
the performance of an entire network is the network coding 
approach sparked by the work of [39], which focuses on 
encoding at the nodes for maximizing the network throughput, 
separately from the channel coding. 

In this paper we focus on the combination of broadcast 
and relay. A relevant work in this context is [40], in which 
the capacity of a class of independent relay channels with 
noiseless relay is derived. Note that the case of noiseless relay 
is also related to the Wyner-Ziv problem [41]. This relationship 
will be highlighted in the sequel. Lastly, we note that a recent 
work, [42], presented an achievability result for the general 
DMBC with a single wireless cooperation channel from one 
receiver to the second receiver. This achievable rate region is 
shown to be the capacity region for the physically degraded 
broadcast/relay channel. 

D. Main Contributions and Organization 

In the following we summarize the main contributions of 
this work. 

• We initially study a special case of the general setup 
formulated in Section I-B: the case of the physically 
degraded broadcast channel. Although the physically 
degraded BC is of little practical interest, it is useful 
in developing the coding concept for the general BC 
with cooperation. For the physically degraded BC, we 
present both an achievability result and a converse. Together, 
these two results give the capacity region for 
this setup. Furthermore, this new region is shown to 
be a strict enlargement of the classical region without 
cooperation [21]. 

• Next, we give an achievability result for the general BC 
with cooperating receivers. This region is also greater, in 
general, than the classic achievable region given in [14] 
for the broadcast channel. 

• We also consider the case where a single common mes- 
sage is transmitted to both receivers. We consider two 
different cooperation strategies and derive the achievable 
rates for each of them. We also derive an upper bound on 
the achievable rates for this scenario. Here we provide re- 
sults that explicitly link the available cooperation capacity 
to the increase in the rate of information. Lastly, we show 
that for a special case of the general BC, namely when 
one channel is distinctly better than the other, the upper 
and lower bounds coincide, resulting in the capacity for 
that case. 

The rest of this paper is structured as follows: in section II 
we define the mathematical framework. In section III we 
analyze the physically degraded BC, and derive the capac- 
ity region for that case, and in section IV we present an 
achievability result for the general broadcast channel with 
cooperating receivers. Next, section V presents achievability 
results and an upper bound on the rates for the case where only 
a single common message is transmitted. Concluding remarks 
are provided in section VI. 

II. Definitions and Notations 

First, a word about notation: in the following we use $H(\cdot)$ 
to denote the entropy of a discrete random variable (RV), and 
$I(\cdot;\cdot)$ to denote the mutual information between two discrete 
random variables, as defined in [43, Ch. 2]. We denote random 
variables with capital letters, e.g., $X$, $Y$, and vectors with 
boldface letters, e.g., $\mathbf{x}$, $\mathbf{y}$. We denote by $A_\epsilon^{(n)}(X)$ the weakly 
typical set for the (possibly vector) random variable $X$; see 
[43, Ch. 3] for the definition of $A_\epsilon^{(n)}(X)$. When referring 
to a typical set we may omit the random variables from the 
notation, when these variables are clear from the context. We 
denote the cardinality of the finite set $\mathcal{A}$ with $\|\mathcal{A}\|$. We use 
$\mathcal{X}$ to denote the (discrete and finite) range of $X$. Finally, 
we denote the probability distribution of the RV $X$ over $\mathcal{X}$ 
with $p(x)$ and the conditional distribution of $X$ given $Y$ with 
$p(x|y)$. 

Definition 1: A discrete broadcast channel is a channel 
with discrete input alphabet $\mathcal{X}$, two discrete output 
alphabets, $\mathcal{Y}_1$ and $\mathcal{Y}_2$, and a probability transition 
function, $p(y_1, y_2|x)$. We denote this channel by the triplet 
$(\mathcal{X}, p(y_1, y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$. 

Definition 2: A memoryless broadcast channel is a broadcast 
channel for which the probability transition function 
of a sequence of $n$ symbols is given by 

$$p(y_1^n, y_2^n | x^n) = \prod_{i=1}^{n} p(y_{1,i}, y_{2,i} | x_i),$$

where $y_k^n = (y_{k,1}, y_{k,2}, \ldots, y_{k,n})$, $k \in \{1,2\}$, and 
$x^n = (x_1, x_2, \ldots, x_n)$. 

We shall assume the channel to be discrete and memoryless. 

Definition 3: The physically degraded broadcast channel is 
a broadcast channel in which the probability transition function 
can be decomposed as $p(y_1, y_2|x) = p(y_1|x)p(y_2|y_1)$. Hence, 
for the physically degraded BC we have that $X - Y_1 - Y_2$ 
form a Markov chain. 

Definition 4: An $(R_{12}, R_{21})$-conference between $R_{x1}$ and 
$R_{x2}$ is defined by two conference message sets $\mathcal{W}_{12} = 
\{1, 2, \ldots, 2^{nR_{12}}\}$, $\mathcal{W}_{21} = \{1, 2, \ldots, 2^{nR_{21}}\}$, and two mapping 
functions, $h_{12}$ and $h_{21}$, which map the received sequence of $n$ 
symbols and the conference messages at one receiver into a 
message transmitted to the other receiver: 

$$h_{12} : \mathcal{Y}_1^n \times \mathcal{W}_{21} \mapsto \mathcal{W}_{12},$$
$$h_{21} : \mathcal{Y}_2^n \times \mathcal{W}_{12} \mapsto \mathcal{W}_{21}.$$

We note that this is not the most general definition of a 
conference, see for example [44], [45] for a more general 
form. In this paper we consider only conferences in which 
each receiver sends at most one message to the other receiver. 
Note that there are cases where a single conference message 
is enough to achieve capacity: for example, in section III a 
single conference step achieves capacity for the physically 
degraded broadcast channel, and in [45] a single conference 
step achieves capacity for the discrete memoryless multiple 
access channel counterpart of the setup discussed here. 

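Definition 5: A $(C_{12}, C_{21})$-admissible conference is a 
conference for which $R_{12} \le C_{12}$ and $R_{21} \le C_{21}$. 

Definition 6: A $((2^{nR_1}, 2^{nR_2}), n, (C_{12}, C_{21}))$ code for the 
broadcast channel with cooperating receivers having conference 
links of capacities $C_{12}$ and $C_{21}$ between them, consists 
of two sets of integers $\mathcal{W}_1 = \{1, 2, \ldots, 2^{nR_1}\}$, $\mathcal{W}_2 = 
\{1, 2, \ldots, 2^{nR_2}\}$, called message sets, an encoding function 

$$f : \mathcal{W}_1 \times \mathcal{W}_2 \mapsto \mathcal{X}^n,$$

a $(C_{12}, C_{21})$-admissible conference 

$$h_{12} : \mathcal{Y}_1^n \times \mathcal{W}_{21} \mapsto \mathcal{W}_{12},$$
$$h_{21} : \mathcal{Y}_2^n \times \mathcal{W}_{12} \mapsto \mathcal{W}_{21},$$

and two decoding functions 

$$g_1 : \mathcal{W}_{21} \times \mathcal{Y}_1^n \mapsto \mathcal{W}_1, \tag{1}$$
$$g_2 : \mathcal{W}_{12} \times \mathcal{Y}_2^n \mapsto \mathcal{W}_2. \tag{2}$$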

Definition 7: The average probability of error is defined as 
the probability that the decoded message pair is different from 
the transmitted message pair: 

$$P_e^{(n)} = \Pr\left(g_1(W_{21}, Y_1^n) \ne W_1 \text{ or } g_2(W_{12}, Y_2^n) \ne W_2\right).$$

We also define the average probability of error for each 
receiver as: 

$$P_{e1}^{(n)} = \Pr\left(g_1(W_{21}, Y_1^n) \ne W_1\right), \tag{3}$$
$$P_{e2}^{(n)} = \Pr\left(g_2(W_{12}, Y_2^n) \ne W_2\right), \tag{4}$$

where we assume transmission of $n$ symbols for each codeword. 
Since each individual error event is contained in the joint error 
event, and by the union bound, we have that 
$\max\{P_{e1}^{(n)}, P_{e2}^{(n)}\} \le P_e^{(n)} \le P_{e1}^{(n)} + P_{e2}^{(n)}$. 
Hence, $P_e^{(n)} \to 0$ implies that both $P_{e1}^{(n)} \to 0$ and 
$P_{e2}^{(n)} \to 0$, and when both individual error 
probabilities go to zero then $P_e^{(n)}$ goes to zero as well. 

In the analysis that follows, we assume that user 1 and user 
2 select their respective messages $W_1$ and $W_2$ independently 
and uniformly over their respective message sets. 

Definition 8: A rate pair $(R_1, R_2)$ is said to be achievable 
if there exists a sequence of $((2^{nR_1}, 2^{nR_2}), n, (C_{12}, C_{21}))$ 
codes with $P_e^{(n)} \to 0$ as $n \to \infty$. Obviously, this is satisfied 
if both $P_{e1}^{(n)} \to 0$ and $P_{e2}^{(n)} \to 0$ as $n$ increases. 

Definition 9: The capacity region for the discrete memory- 
less broadcast channel with cooperating receivers is the convex 
hull of all achievable rates. 

III. Capacity Region for the Physically Degraded 
Broadcast Channel with Cooperating Receivers 

We consider the physically degraded broadcast channel with 
three independent messages: a private message to each receiver 
and a common message to both. We note that for the physically 
degraded channel, following the argument in [43, theorem 
14.6.4], we can incorporate a common rate to both receivers by 
replacing $R_2$, the private rate to the bad receiver obtained for 
the two private messages case, with $R_0 + R_2$, where $R_0$ denotes 
the rate of the common information. Without cooperation, the 
capacity region for the physically degraded BC $X - Y_1 - Y_2$, 
given in [43, theorem 14.6.4], is the convex hull of all the rate 
triplets $(R_0, R_1, R_2)$ that satisfy 

$$R_1 \le I(X; Y_1 | U), \tag{5}$$
$$R_0 + R_2 \le I(U; Y_2), \tag{6}$$

for some joint distribution $p(u)p(x|u)p(y_1|x)p(y_2|y_1)$, where 

$$\|\mathcal{U}\| \le \min\{\|\mathcal{X}\|, \|\mathcal{Y}_1\|, \|\mathcal{Y}_2\|\}. \tag{7}$$

Next, consider cooperation between receivers over the physically 
degraded BC. First note that for this case, the link 
from $R_{x2}$ to $R_{x1}$ does not contribute to increasing the rates 
due to cooperation, and that only the link from $R_{x1}$ to $R_{x2}$ 
does. This is due to the data processing inequality (see [43, 
theorem 2.8.1]): since $X - Y_1 - Y_2$ form a Markov chain, any 
information about $X$ contained in $Y_2$ will also be contained 
in $Y_1$, and thus conferencing cannot help: 

$$I(X; Y_1, Y_2) = I(X; Y_1) + I(X; Y_2 | Y_1) = I(X; Y_1).$$

For the rest of this section then, we shall consider only a 
communication link from the good receiver, $R_{x1}$, to the bad 
receiver, $R_{x2}$ (i.e., we set $C_{21} = 0$). This implies that $W_{21}$ is a 
constant and we can thus omit it from the analysis. We begin 
with a statement of the theorem: 

Theorem 1: The capacity region for sending independent 
information over the discrete memoryless physically degraded 
broadcast channel $X - Y_1 - Y_2$, with cooperating receivers 
having a noiseless conference link of capacity $C_{12}$, as defined 
in Section II, is the convex hull of all rate triplets $(R_0, R_1, R_2)$ 
that satisfy 

$$R_1 \le I(X; Y_1 | U), \tag{8}$$
$$R_0 + R_2 \le \min\left(I(U; Y_1),\; I(U; Y_2) + C_{12}\right), \tag{9}$$

for some joint distribution $p(u)p(x|u)p(y_1, y_2|x)$, where the 
auxiliary random variable $U$ has cardinality bounded by 
$\|\mathcal{U}\| \le \min\{\|\mathcal{X}\|, \|\mathcal{Y}_1\|\}$. 

We note that this result presented in [46] was simultaneously 
derived in [42] for the case of a wireless relay. 

A. Achievability Proof 

In this section, we show that the rate triplets of theorem 1 
are indeed achievable. We will show that the region defined by 
(8) and (9) with $R_0 = 0$ is achievable. Incorporating $R_0 > 0$ 
easily follows as explained earlier. 

1) Overview of Coding Strategy: The coding strategy is a 
combination of a broadcast code as an "outer" code used to 
split the rate between $R_{x1}$ and $R_{x2}$, and an "inner" code for 
$R_{x2}$, using the code construction for the physically degraded 
relay channel, described in [34, theorem 1]. We first generate 
codewords $U^n$ for $R_{x2}$, according to the relay channel code 
construction. Then, the codewords for $R_{x2}$ are used as "cloud 
centers" for the codewords transmitted to $R_{x1}$ (which are also 
the output to the channel). Upon reception, $R_{x1}$ decodes both 
its own message and the message for $R_{x2}$, and then uses the 
relay code selection to select the message relayed to $R_{x2}$. $R_{x2}$ 
uses its received signal, $Y_2^n$, to generate a list of possible $U^n$ 
candidates, and then uses the information from $R_{x1}$ to resolve 
for the correct codeword. 

2) Details of Coding Strategy: 

a) Code Generation: 

1) Consider first the set of $M_R = 2^{nC_{12}}$ relay messages. 
These are the messages that the relay $R_{x1}$ transmits to 
$R_{x2}$ through the noiseless finite capacity conference link 
between the two receivers. Index these messages by $s$, 
where $s \in \{1, 2, \ldots, M_R\}$. 

Next, fix $p(u)$ and $p(x|u)$. 

2) For each index $s \in [1, M_R]$, generate $2^{nR_2}$ conditionally 
independent codewords $\mathbf{u}(w_2|s) \sim \prod_{i=1}^{n} p(u_i)$, where 
$w_2 \in \{1, 2, \ldots, 2^{nR_2}\}$. 

3) For each codeword $\mathbf{u}(w_2|s)$ generate $2^{nR_1}$ conditionally 
independent codewords $\mathbf{x}(w_1, w_2|s) = 
\mathbf{x}(w_1|\mathbf{u}(w_2|s)) \sim \prod_{i=1}^{n} p(x_i|u_i(w_2|s))$, where $w_1 \in 
\{1, 2, \ldots, 2^{nR_1}\}$. 

4) Randomly partition the message set for $R_{x2}$, 
$\{1, 2, \ldots, 2^{nR_2}\}$, into $M_R$ sets $\{S_1, S_2, \ldots, S_{M_R}\}$, 
by independently and uniformly assigning to each 
message an index in $[1, M_R]$. 

b) Encoding Procedure: Consider transmission of $B$ 
blocks, each block transmitted using $n$ channel symbols. Here 
we use $nB$ symbol transmissions to transmit $B - 1$ message 
pairs $(w_{1,i}, w_{2,i}) \in [1, 2^{nR_1}] \times [1, 2^{nR_2}]$, $i = 1, 2, \ldots, B-1$. 
As $B \to \infty$ we have that the rate $(R_1, R_2)\frac{B-1}{B} \to (R_1, R_2)$. 
Hence, any rate pair achievable without blocking can be 
approached arbitrarily closely with blocking as well. Let $w_{1,i}$ and 
$w_{2,i}$ be the messages intended for $R_{x1}$ and $R_{x2}$ respectively, 
at the $i$'th block, and also assume that $w_{2,i-1} \in S_{s_i}$. $R_{x1}$ has 
an estimate $\hat{w}_{2,i-1}$ of the message sent to $R_{x2}$ at block $i-1$. 
Let $\hat{w}_{2,i-1} \in S_{\hat{s}_i}$. At the $i$'th block the transmitter outputs the 
codeword $\mathbf{x}(w_{1,i}, w_{2,i}|s_i)$, and $R_{x1}$ sends the index $\hat{s}_i$ to $R_{x2}$ 
through the noiseless conference link. 



c) Decoding Procedure: Assume first that up to the 
end of the $(i-1)$'th block there was no decoding error. 
Hence, at the end of the $(i-1)$'th block, $R_{x1}$ 
knows $(w_{1,1}, w_{1,2}, \ldots, w_{1,i-1})$, $(w_{2,1}, w_{2,2}, \ldots, w_{2,i-1})$ and 
$(s_1, s_2, \ldots, s_i)$, and $R_{x2}$ knows $(w_{2,1}, w_{2,2}, \ldots, w_{2,i-2})$ and 
$(s_1, s_2, \ldots, s_{i-1})$. The decoding at block $i$ proceeds as follows: 

1) $R_{x1}$ knows $s_i$ from $w_{2,i-1}$. Hence, $R_{x1}$ determines 
uniquely $(\hat{w}_{1,i}, \hat{w}_{2,i})$ s.t. 
$(\mathbf{u}(\hat{w}_{2,i}|s_i), \mathbf{x}(\hat{w}_{1,i}, \hat{w}_{2,i}|s_i), \mathbf{y}_1(i)) \in A_\epsilon^{(n)}$. If there is 
none or there is more than one, an error is declared. 

2) $R_{x2}$ receives $s_i$ from $R_{x1}$. From knowledge of $s_{i-1}$ 
and $\mathbf{y}_2(i-1)$, $R_{x2}$ forms a list of possible messages, 
$\mathcal{L}(i-1) = \{w_2 : (\mathbf{y}_2(i-1), \mathbf{u}(w_2|s_{i-1})) \in A_\epsilon^{(n)}\}$. 
Now, $R_{x2}$ uses $s_i$ to find a unique $\hat{w}_{2,i-1} \in S_{s_i} \cap \mathcal{L}(i-1)$. 
If there is none or there is more than one, an error 
is declared. 
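
The interplay between the random partition of step 4 in the code generation and the list formed in decoding step 2 can be illustrated with a toy simulation. The following is only a sketch under stated assumptions: the set sizes are small stand-ins for $2^{nR_2}$ and $M_R$, the candidate list is drawn at random instead of being produced by a typicality test, messages are indexed from 0, and the names `make_bins` and `resolve` are hypothetical helpers that do not appear in the paper.

```python
import random


def make_bins(num_messages, num_bins, rng):
    """Assign each message an independent, uniformly chosen bin index."""
    return [rng.randrange(num_bins) for _ in range(num_messages)]


def resolve(candidates, bin_of, relayed_bin):
    """Keep only the candidate messages that fall in the relayed bin."""
    return [w for w in candidates if bin_of[w] == relayed_bin]


rng = random.Random(0)     # fixed seed so the toy run is repeatable
num_messages = 2 ** 10     # stands in for 2^{n R_2}
num_bins = 2 ** 6          # stands in for M_R = 2^{n C_12}
list_size = 2 ** 3         # stands in for the size of the list built from y_2

bin_of = make_bins(num_messages, num_bins, rng)
true_w2 = rng.randrange(num_messages)

# Hypothetical list: the true message plus a few spurious candidates.
others = [w for w in range(num_messages) if w != true_w2]
candidates = [true_w2] + rng.sample(others, list_size - 1)

# The relay sends the bin index of the true message over the conference link.
survivors = resolve(candidates, bin_of, bin_of[true_w2])

# A unique survivor is the decoded message; otherwise an error is declared.
decoded = survivors[0] if len(survivors) == 1 else None
```

The true message always survives the intersection, so an error can occur only when a spurious candidate happens to share its bin; this becomes unlikely when the number of bins is large relative to the list size.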

3) Analysis of the Probability of Error: The achievable rate 
to $R_{x2}$ can be proved using the same technique as in [34, 
theorem 1]. For ease of description assume that $R_{x1}$ is 
connected via an orthogonal channel to $R_{x2}$ and let $X'$ denote 
the channel input from $R_{x1}$ and $Y'$ the corresponding channel 
output to $R_{x2}$. Thus, $R_{x2}$ has combined input $(Y_2, Y')$. The 
overall transition matrix is given by 

$$p(y_1, y_2, y'|x, x') = p(y_1, y_2|x)p(y'|x'). \tag{10}$$

Additionally, we select the transition matrix $p(y'|x')$ and the 
input and output alphabets $\mathcal{X}'$, $\mathcal{Y}'$ such that the capacity of 
the orthogonal channel $X' - Y'$ is $C_{12}$. An example of such a 
selection is letting $\mathcal{X}' = \mathcal{Y}' = \{0, 1, \ldots, 2^{\lceil C_{12} \rceil} - 1\}$, where $\lceil \cdot \rceil$ 
denotes the ceil function. Letting $\lfloor a \rfloor$ denote the integer part 
of the real number $a$, we set the channel transition function 
to be 

^ Y \ A >-\ a ,Y'= mod + 2^1,2^1), 

with $\alpha$ selected such that $H(Y'|X') = \lceil C_{12} \rceil - C_{12}$. The 
capacity of this channel is $C_{12}$ and is achieved by letting 
$p(x') = 2^{-\lceil C_{12} \rceil}$, $\forall x' \in \mathcal{X}'$. This setup is equivalent to the 
original setup described in section I-B. 

Now consider the rate to $R_{x2}$. The Markov chain $U - 
X - (Y_1, Y_2)$ combined with the condition in (10) implies 
the following probability distribution function (p.d.f.) 

$$p(u, y_1, y_2, y', x') = p(y_1, y_2|u)p(y'|x')p(u, x').$$

Now, applying [34, theorem 1], with $p(u, x') = p(u)p(x')$, we 
have that (see also [32]) 

$$\begin{aligned}
R_2 &\le \min\{I(U, X'; Y_2, Y'),\; I(U; Y_1|X')\} \\
&= \min\{I(U, X'; Y') + I(U, X'; Y_2|Y'),\; I(U; Y_1)\} \\
&= \min\{I(X'; Y') + I(U; Y'|X') + I(U; Y_2|Y') + I(X'; Y_2|Y', U),\; I(U; Y_1)\} \\
&= \min\{C_{12} + I(U; Y_2),\; I(U; Y_1)\}.
\end{aligned}$$

Next, consider the rate to $R_{x1}$. From the proof of [34, theorem 
1] we have that $R_{x1}$ decodes $W_2$. Therefore, $R_{x1}$ can now use 
successive decoding similar to the decoding at $R_{x1}$ in [43, Ch. 
14.6.2], which implies that the achievable rate to $R_{x1}$ is given 
by $R_1 \le I(X; Y_1|U)$. Combining both bounds we get the rate 
constraints of theorem 1. 

B. Converse Proof 

In this section we prove that for $P_e^{(n)} \to 0$, the rates must 
satisfy the constraints in theorem 1. First, note that for the case 
of the physically degraded broadcast channel with cooperating 
receivers we have the following Markov chain: 

$$X^n - Y_1^n - (W_{12}(Y_1^n), Y_2^n). \tag{11}$$

Considering the definition of the decoders in (1) and (2), 
and the definition of the probability of error for each of the 
receivers in (3) and (4), we have from Fano's inequality ([43, 
Ch. 2.11]) that 

$$H(W_1|Y_1^n) \le P_{e1}^{(n)} \log_2(2^{nR_1} - 1) + h(P_{e1}^{(n)}) = n\delta(P_{e1}^{(n)}), \tag{12}$$
$$H(W_2|Y_2^n, W_{12}(Y_1^n)) \le P_{e2}^{(n)} \log_2(2^{nR_2} - 1) + h(P_{e2}^{(n)}) = n\delta(P_{e2}^{(n)}), \tag{13}$$

where $h(P)$ is the entropy of a Bernoulli RV with parameter 
$P$. Note that when $P_{e1}^{(n)} \to 0$ then $\delta(P_{e1}^{(n)}) \to 0$ and when 
$P_{e2}^{(n)} \to 0$ then $\delta(P_{e2}^{(n)}) \to 0$. 
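
The behavior of the Fano term can be checked numerically. The following is a minimal sketch, assuming the per-symbol quantity is read as $\delta(P) = \left(P \log_2(2^{nR} - 1) + h(P)\right)/n$; the function names `binary_entropy` and `fano_delta` are illustrative and not taken from the paper.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) random variable; h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fano_delta(p_err, n, rate):
    """Return (P * log2(2^{nR} - 1) + h(P)) / n, assuming n * rate > 0."""
    n_r = n * rate
    if n_r <= 0:
        raise ValueError("n * rate must be positive")
    # log2(2^{nR} - 1) is written as nR + log2(1 - 2^{-nR}) so that large
    # values of nR do not overflow.
    log_term = n_r + math.log2(1.0 - 2.0 ** (-n_r))
    return (p_err * log_term + binary_entropy(p_err)) / n


# Example evaluation for a shrinking error probability at fixed n and R.
values = [fano_delta(p, n=1000, rate=0.5) for p in (1e-1, 1e-2, 1e-3, 0.0)]
```

Since $\log_2(2^{nR} - 1) \le nR$ and $h(P) \le 1$, the value is at most $PR + 1/n$, which makes the convergence to zero explicit.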
Now, for $R_{x1}$ we have that 

$$nR_1 = H(W_1) = I(W_1; Y_1^n) + H(W_1|Y_1^n).$$

Applying inequality (12), and then proceeding as in [4] we 
get the bound on $R_1$ as 

$$nR_1 \le \sum_{k=1}^{n} I(X_k; Y_{1,k}|U_k) + n\delta(P_{e1}^{(n)}),$$

where $U_k \triangleq (Y_{1,1}, Y_{1,2}, \ldots, Y_{1,k-1}, W_2)$. 
For $R_{x2}$ we can write 

$$\begin{aligned}
nR_2 &= H(W_2) \\
&\overset{(a)}{\le} I(W_2; Y_2^n, W_{12}(Y_1^n)) + n\delta(P_{e2}^{(n)}) \qquad (14) \\
&= I(W_2; Y_2^n) + I(W_2; W_{12}(Y_1^n)|Y_2^n) + n\delta(P_{e2}^{(n)}),
\end{aligned}$$

where the inequality in (a) is due to (13). Proceeding as in [4], 
we bound $I(W_2; Y_2^n) \le \sum_{k=1}^{n} I(U_k; Y_{2,k})$. Next, we bound 
$I(W_2; W_{12}(Y_1^n)|Y_2^n)$ as follows: 

$$\begin{aligned}
I(W_{12}(Y_1^n); W_2|Y_2^n) &\le H(W_{12}(Y_1^n)|Y_2^n) \\
&\le H(W_{12}(Y_1^n)) \\
&\le nC_{12}, \qquad (15)
\end{aligned}$$

where the first inequality follows from the definition of mutual 
information, the second is due to removing the conditioning 
and the third is due to the admissibility of the conference. 
Combining both bounds we get that 

$$nR_2 \le \sum_{k=1}^{n} I(U_k; Y_{2,k}) + nC_{12} + n\delta(P_{e2}^{(n)}). \tag{16}$$



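The bound on $R_2$ can be developed in an alternative way. 
Begin with (14): 

$$\begin{aligned}
nR_2 &\le I(W_2; Y_2^n, W_{12}(Y_1^n)) + n\delta(P_{e2}^{(n)}) \\
&\overset{(a)}{\le} I(W_2; Y_2^n, Y_1^n) + n\delta(P_{e2}^{(n)}) \\
&= \sum_{k=1}^{n} I(W_2; Y_{1,k}, Y_{2,k}|Y_1^{k-1}, Y_2^{k-1}) + n\delta(P_{e2}^{(n)}), \qquad (17)
\end{aligned}$$

where (a) follows from the fact that $(W_1, W_2) - (Y_1^n, Y_2^n) - 
(W_{12}, Y_2^n)$ is a Markov relation and from the data processing 
inequality. Next, we can write 

$$\begin{aligned}
I(W_2; Y_{1,k}, Y_{2,k}|Y_1^{k-1}, Y_2^{k-1})
&\overset{(a)}{=} I(W_2; Y_{1,k}|Y_1^{k-1}, Y_2^{k-1}) \\
&= H(Y_{1,k}|Y_1^{k-1}, Y_2^{k-1}) - H(Y_{1,k}|Y_1^{k-1}, Y_2^{k-1}, W_2) \\
&\overset{(b)}{\le} H(Y_{1,k}) - H(Y_{1,k}|Y_1^{k-1}, Y_2^{k-1}, W_2) \\
&\overset{(c)}{=} H(Y_{1,k}) - H(Y_{1,k}|Y_1^{k-1}, W_2) \\
&= I(Y_{1,k}; Y_1^{k-1}, W_2) \\
&= I(Y_{1,k}; U_k), \qquad (18)
\end{aligned}$$

where the equality in (a) is due to the physical degradedness 
and memorylessness of the channel, (b) is due to removing the 
conditioning, and (c) is because the Markov chain makes $Y_{1,k}$ 
independent of $Y_2^{k-1}$ given $Y_1^{k-1}$. Plugging this into (17), we 
get a second bound on $R_2$: 

$$nR_2 \le \sum_{k=1}^{n} I(U_k; Y_{1,k}) + n\delta(P_{e2}^{(n)}).$$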

Collecting the three bounds we have: 

$$R_1 \le \frac{1}{n}\sum_{k=1}^{n} I(X_k; Y_{1,k}|U_k) + \delta(P_{e1}^{(n)}), \tag{19}$$
$$R_2 \le \frac{1}{n}\sum_{k=1}^{n} I(U_k; Y_{2,k}) + C_{12} + \delta(P_{e2}^{(n)}), \tag{20}$$
$$R_2 \le \frac{1}{n}\sum_{k=1}^{n} I(U_k; Y_{1,k}) + \delta(P_{e2}^{(n)}). \tag{21}$$

Using the standard time-sharing argument as in [43, Ch. 14.3], 
we can write the averages in (19) - (21) by introducing 
an appropriate time sharing variable, with cardinality upper 
bounded by 4. Therefore, if $P_{e1}^{(n)} \to 0$ and $P_{e2}^{(n)} \to 0$ as 
$n \to \infty$, the convex hull of this region can be shown to be 
equivalent to the convex hull of the region defined by 

$$R_1 \le I(X; Y_1|U), \tag{22}$$
$$R_2 \le I(U; Y_2) + C_{12}, \tag{23}$$
$$R_2 \le I(U; Y_1). \tag{24}$$

Finally, the bound on the cardinality of $\mathcal{U}$ follows from the 
same arguments as in the converse for the non-cooperative 
case in [4]. Note however, that $\|\mathcal{Y}_2\|$ is absent from the 
minimization on the cardinality (cf. equation (7) for the 
non-cooperative case). The reason is that even when $\|\mathcal{Y}_2\| = 1$, 
information to $R_{x2}$ (represented by the random variable $U$) 
can be sent through the conference link between the two 
receivers. ■ 
C. Discussion 

To illustrate the implications of theorem 1, consider the 
physically degraded binary symmetric broadcast channel 
(BSBC) depicted in figure 2. For this channel, theorem 1 
implies that $\|\mathcal{U}\| = 2$. Due to the symmetry of the channel, the 
probability distribution of $U$ which maximizes the rates is a 
symmetric binary distribution, $\Pr(U = 0) = \Pr(U = 1) = \frac{1}{2}$. 
The resulting capacity region for this case is depicted in figure 
3 for the case where $R_0 = 0$. In the figure, the bottom line 
(dash) is the non-cooperative capacity region, and the top line 
(dash-dot) is the maximum possible sum rate, which requires 
that $C_{12} \ge h(p_{12}) - h(p_1)$, where 

$$h(p) = -p\log_2(p) - (1-p)\log_2(1-p),$$
$$p_{12} = p_1(1-p_2) + p_2(1-p_1).$$

This maximum sum-rate of $I(X; Y_1)$ is obtained by summing 
the rate to $R_{x1}$ given by (22) and the maximum possible rate 
for $R_{x2}$ given by (24), and using the Markov chain relation 
$U - X - Y_1$. The middle line (solid) is the capacity region for 
the partial cooperation case where $0 < C_{12} < h(p_{12}) - h(p_1)$. 

Fig. 2. The physically degraded BSBC. $p_u$, $p_1$ and $p_2$ are the 
transition probabilities at the left, middle and right segments respectively. 

Fig. 3. The capacity region for the physically degraded BSBC. Top, 
middle and bottom lines correspond to maximum possible cooperation, 
partial cooperation and no-cooperation scenarios respectively. 
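
The boundary curves of this example can be traced numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: it takes $U$ uniform as stated above, models the left, middle and right segments as binary symmetric channels with crossover probabilities `pu`, `p1` and `p2`, sets $R_0 = 0$, and sweeps `pu` over $[0, 1/2]$. The function names are hypothetical.

```python
import math


def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def conv(a, b):
    """Crossover probability of two cascaded binary symmetric channels."""
    return a * (1.0 - b) + b * (1.0 - a)


def bsbc_boundary(p1, p2, c12, steps=50):
    """Boundary points (R1, R2) of the theorem 1 region with R0 = 0."""
    p12 = conv(p1, p2)
    points = []
    for i in range(steps + 1):
        pu = 0.5 * i / steps
        r1 = h(conv(pu, p1)) - h(p1)        # I(X; Y1 | U)
        i_u_y1 = 1.0 - h(conv(pu, p1))      # I(U; Y1) for uniform U
        i_u_y2 = 1.0 - h(conv(pu, p12))     # I(U; Y2) for uniform U
        r2 = min(i_u_y1, i_u_y2 + c12)      # right-hand side of (9)
        points.append((r1, r2))
    return points


p1, p2 = 0.1, 0.1
full_coop = h(conv(p1, p2)) - h(p1)          # threshold h(p12) - h(p1)
no_coop = bsbc_boundary(p1, p2, 0.0)
partial = bsbc_boundary(p1, p2, 0.5 * full_coop)
maximal = bsbc_boundary(p1, p2, full_coop)
```

Plotting `no_coop`, `partial` and `maximal` gives three curves ordered as the bottom, middle and top lines described for figure 3; the crossover values 0.1 are arbitrary choices for the example.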

As can be seen from this example, the capacity region 
derived in this section is strictly larger than the capacity region 
for the non-cooperation case. Indeed, summing the constraints 
on $R_0$, $R_1$ and $R_2$ without cooperation (equations (5), (6)) 
results in a maximum achievable sum-rate of 

$$R_0 + R_1 + R_2 \le I(X; Y_1) - (I(U; Y_1) - I(U; Y_2)), \tag{25}$$

where the second term is always positive due to the Markov 
chain $U - X - Y_1 - Y_2$ (assuming the degrading channel is 
non-invertible$^1$). In this setup, the maximum possible sum-rate, 
$I(X; Y_1)$, is achieved only when $U$ is a constant, and 
thus no information is sent to $R_{x2}$. When $R_0 + R_2 > 0$, 
because of the relationship $R_0 + R_2 \le I(U; Y_2) < I(U; Y_1)$, 
we cannot achieve the maximum sum-rate of $I(X; Y_1)$ to 
$R_{x1}$. However, summing (23) or (24) with (22) results in a 
maximum achievable sum-rate with cooperating receivers of 

$$R_0 + R_1 + R_2 \le I(X; Y_1) + \min\{0,\; C_{12} - (I(U; Y_1) - I(U; Y_2))\}. \tag{26}$$

Comparing this to the non-cooperative sum-rate given by (25), it 
is clear that cooperation allows a net increase in the sum-rate, 
by at most $C_{12}$. 
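
The comparison between (25) and (26) can be made concrete for a fixed choice of $U$. The following is a small sketch in which the three mutual-information values are hypothetical inputs chosen only for illustration, and the function names are not from the paper.

```python
def sum_rate_no_coop(i_x_y1, i_u_y1, i_u_y2):
    """Right-hand side of (25): sum-rate bound without cooperation."""
    return i_x_y1 - (i_u_y1 - i_u_y2)


def sum_rate_coop(i_x_y1, i_u_y1, i_u_y2, c12):
    """Right-hand side of (26): sum-rate bound with conference capacity c12."""
    return i_x_y1 + min(0.0, c12 - (i_u_y1 - i_u_y2))


def cooperation_gain(i_x_y1, i_u_y1, i_u_y2, c12):
    """Increase of (26) over (25) for the same distribution."""
    return (sum_rate_coop(i_x_y1, i_u_y1, i_u_y2, c12)
            - sum_rate_no_coop(i_x_y1, i_u_y1, i_u_y2))


# Hypothetical values in bits: I(X;Y1), I(U;Y1), I(U;Y2).
quantities = (0.5, 0.3, 0.1)
small_link = cooperation_gain(*quantities, c12=0.05)
large_link = cooperation_gain(*quantities, c12=1.0)
```

For a fixed distribution the gain equals the smaller of $C_{12}$ and the gap $I(U; Y_1) - I(U; Y_2)$, so a conference link larger than the gap brings no further increase.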

IV. Achievable Rates for the General Broadcast 
Channel with Cooperating Receivers 

For the classic general BC scenario, the best achievability 
result was derived by Marton in [14]. This result states that 
for the general BC, any rate pair $(R_1, R_2)$ satisfying 

$$R_1 \le I(U; Y_1),$$
$$R_2 \le I(V; Y_2),$$
$$R_1 + R_2 \le I(U; Y_1) + I(V; Y_2) - I(U; V),$$

for some joint distribution $p(u, v, x, y_1, y_2) = 
p(u, v, x)p(y_1, y_2|x)$, is achievable. 

We note that Marton's largest region contains three auxiliary 
RVs, $(W, U, V)$, where $W$ represents information decoded by 
both receivers. Here we use a simplified version, where $W$ is 
set to a constant. 

We now consider cooperation between the receivers. We 
begin with a statement of the theorem: 

Theorem 2: Let $(\mathcal{X}, p(y_1, y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$ be any discrete 
memoryless broadcast channel, with cooperating receivers 
having noiseless conference links of finite capacities $C_{12}$ and 
$C_{21}$, as defined in Section II. Then, for sending independent 
information, any rate pair $(R_1, R_2)$ satisfying 

$$R_1 \le R(U), \tag{27}$$
$$R_2 \le R(V), \tag{28}$$
$$R_1 + R_2 \le R(U) + R(V) - I(U; V), \tag{29}$$

subject to 

$$C_{21} \ge I(\hat{U}; Y_2) - I(\hat{U}; Y_1), \tag{30}$$
$$C_{12} \ge I(\hat{V}; Y_1) - I(\hat{V}; Y_2), \tag{31}$$

where 

$$R(U) = I(U; Y_1, \hat{U}), \tag{32}$$
$$R(V) = I(V; Y_2, \hat{V}), \tag{33}$$

for some joint distribution $p(u, v, x, y_1, y_2, \hat{u}, \hat{v}) = 
p(u, v, x)p(y_1, y_2|x)p(\hat{u}|y_2)p(\hat{v}|y_1)$, is achievable, with 
$u \in \mathcal{U}$, $v \in \mathcal{V}$, $\hat{u} \in \hat{\mathcal{U}}$, $\hat{v} \in \hat{\mathcal{V}}$, $\|\hat{\mathcal{U}}\| \le \|\mathcal{Y}_2\| + 1$ and 
$\|\hat{\mathcal{V}}\| \le \|\mathcal{Y}_1\| + 1$. 

In the next subsections we provide the proof of this theorem. 
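
Once the information quantities of theorem 2 have been evaluated for a fixed joint distribution, membership of a rate pair in the resulting region is a direct check of (27)-(31). The sketch below assumes those quantities are supplied by the caller: `min_c21` and `min_c12` stand for the right-hand sides of (30) and (31), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Theorem2Point:
    """Information quantities for one fixed joint distribution, in bits."""
    r_u: float       # R(U), equation (32)
    r_v: float       # R(V), equation (33)
    i_uv: float      # I(U; V)
    min_c21: float   # right-hand side of (30)
    min_c12: float   # right-hand side of (31)


def conference_ok(point, c12, c21):
    """Check the conference-capacity conditions (30) and (31)."""
    return c21 >= point.min_c21 and c12 >= point.min_c12


def rate_pair_ok(point, r1, r2, c12, c21):
    """Check (27)-(29) for (r1, r2), given that (30)-(31) hold."""
    if not conference_ok(point, c12, c21):
        return False
    return (r1 <= point.r_u
            and r2 <= point.r_v
            and r1 + r2 <= point.r_u + point.r_v - point.i_uv)


# Hypothetical numbers chosen only to exercise the checks.
example = Theorem2Point(r_u=0.6, r_v=0.5, i_uv=0.2, min_c21=0.3, min_c12=0.4)
inside = rate_pair_ok(example, 0.4, 0.4, c12=0.5, c21=0.5)
```

The achievable region of the theorem is the union of such sets over all admissible joint distributions, so a full evaluation would repeat this check over a search of distributions.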

$^1$ It can be shown that $I(U; Y_1) - I(U; Y_2) = 0$ for the degraded channel 
setup implies that if $R_0 + R_2 > 0$ then $H(Y_1|Y_2) = 0$, i.e., the channel 
from $R_{x1}$ to $R_{x2}$ is invertible. Under these circumstances, this setup can be 
replaced by an equivalent setup in which both receivers get $Y_1$, but such a 
degenerate setup is not interesting. 
A. Overview of Coding Strategy 

As in the achievability part of theorem 1, the proposed 
code is a hybrid broadcast-relay code. Here, we combine the 
relay code construction of [34, theorem 6] and the broadcast 
code construction of [15]. The fact that in these two theorems 
the channel encoding and the relay operation are performed 
independently allows them to be combined easily into a hybrid 
coding scheme. The encoder generates broadcast codewords, 
each selected from a codebook constructed similarly to the 
construction of [15]. This codebook splits the rate between 
the two users. Next, each relay ($R_{x1}$ acts as a relay for 
$R_{x2}$ and vice-versa) generates its codebook according to the 
construction of [34, theorem 6]. In the decoding step, using 
the received signal ($Y_1^n$ at $R_{x1}$ and $Y_2^n$ at $R_{x2}$), each receiver 
generates a list of the possible transmitted relay messages 
and uses the conference message from the next time interval 
to resolve for the relay message. Then, each receiver uses 
the decoded relay message and its received channel output to 
decode its own message. 

B. Encoding at the Transmitter 

1) Let $\epsilon > 0$ and $n \ge 1$ be given. Fix $p(u, v, x)$, $p(\hat{u}|y_2)$ 
and $p(\hat{v}|y_1)$, and let $\delta > 0$ be a positive number, whose 
selection is described in the next item. Let $A_\delta^{*(n)}(U)$ 
denote the set of strongly typical i.i.d. sequences of 
length $n$, $\mathbf{u} \in \mathcal{U}^n$, as defined in [43, Ch. 13.6]. 
Let $A_\delta^{*(n)}(V)$ denote the set of strongly typical i.i.d. 
sequences of length $n$, $\mathbf{v} \in \mathcal{V}^n$. Let $S_{[U]\delta}^{(n)}$ denote the set 
of all sequences $\mathbf{u} \in A_\delta^{*(n)}(U)$ such that $A_\delta^{*(n)}(V|\mathbf{u})$ 
is nonempty, as defined in [47, corollary 5.11], and 
similarly define $S_{[V]\delta}^{(n)}$ for the sequences $\mathbf{v} \in A_\delta^{*(n)}(V)$. 

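2) Select $2^{n(R(U)-\epsilon)}$ strongly typical sequences $\mathbf{u}$ in an i.i.d. manner, according to the probability

$$p(\mathbf{u}) = \begin{cases} \dfrac{1}{\|S^{(n)}_{[U]\delta}\|}, & \mathbf{u} \in S^{(n)}_{[U]\delta} \\ 0, & \text{otherwise.} \end{cases}$$

Label these sequences by $\mathbf{u}(k)$, $k \in [1, 2^{n(R(U)-\epsilon)}]$.

Select $2^{n(R(V)-\epsilon)}$ strongly typical sequences $\mathbf{v}$ in an i.i.d. manner, according to the probability

$$p(\mathbf{v}) = \begin{cases} \dfrac{1}{\|S^{(n)}_{[V]\delta}\|}, & \mathbf{v} \in S^{(n)}_{[V]\delta} \\ 0, & \text{otherwise.} \end{cases}$$

Label these sequences by $\mathbf{v}(l)$, $l \in [1, 2^{n(R(V)-\epsilon)}]$. Note that from [47, corollary 5.11] we have that $\|S^{(n)}_{[U]\delta}\| \ge (1-\delta)2^{n(H(U)-\psi)}$, where $\psi \to 0$ as $n \to \infty$ and $\delta \to 0$, so for any $\epsilon > 0$ we can always find $0 < \delta < \epsilon$ such that for $n$ large enough we obtain $\|S^{(n)}_{[U]\delta}\| \ge 2^{n(I(U;Y_1,\hat{U})-\epsilon)}$ and $\|S^{(n)}_{[V]\delta}\| \ge 2^{n(I(V;Y_2,\hat{V})-\epsilon)}$.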

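3) Define the cells

$$B_i = \left[(i-1)2^{n(R(U)-R_1-\epsilon)} + 1,\; i\,2^{n(R(U)-R_1-\epsilon)}\right],$$

$i \in [1, 2^{nR_1}]$. This is a partition of the $\mathbf{u}$ sequences into $2^{nR_1}$ sets. Define the cells

$$C_j = \left[(j-1)2^{n(R(V)-R_2-\epsilon)} + 1,\; j\,2^{n(R(V)-R_2-\epsilon)}\right],$$

$j \in [1, 2^{nR_2}]$, which form a partition of the $\mathbf{v}$ sequences into $2^{nR_2}$ sets.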

4) For every pair of integers $(w_1,w_2) \in [1,2^{nR_1}] \times [1,2^{nR_2}]$, define the set

$$\mathcal{D}_{w_1,w_2} = \left\{(\mathbf{u}(k),\mathbf{v}(l)) : k \in B_{w_1},\, l \in C_{w_2},\, (\mathbf{u}(k),\mathbf{v}(l)) \in A_\epsilon^{*(n)}(U,V)\right\}.$$

Here, $A_\epsilon^{*(n)}(U,V)$ denotes the strongly typical set for the random variables $U$ and $V$ as defined in [43, Ch. 13.6]. In the following we may omit the random variables when referring to the strongly typical set, when these variables are clear from the context. We now have the following (slightly modified) lemma from [15]:

Lemma 1: For any 2-D cell $B_i \times C_j$, $\epsilon > 0$, and $n$ large enough, we have that $\Pr(\|\mathcal{D}_{i,j}\| = 0) < \epsilon$, provided that

$$R_1 + R_2 < R(U) + R(V) - I(U;V) - 2\epsilon - \epsilon_1, \qquad (34)$$

where $\epsilon_1 \to 0$ as $\epsilon \to 0$ and $n \to \infty$.

Proof: The proof of this lemma is obtained by direct application of the technique used to prove [15, Lemma in pg. 121], and therefore will not be repeated here. ■

5) For each message pair $(w_1,w_2)$, select one pair $(\mathbf{u}(k_{w_1,w_2}),\mathbf{v}(l_{w_1,w_2})) \in \mathcal{D}_{w_1,w_2}$. For each of the selected pairs (one pair for each message pair), generate a codeword according to $\mathbf{x}(w_1,w_2) \sim \prod_{i=1}^{n} p\left(x_i \,|\, u_i(k_{w_1,w_2}), v_i(l_{w_1,w_2})\right)$.

6) To transmit the message pair $(w_1,w_2)$ the transmitter outputs $\mathbf{x}(w_1,w_2)$.
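The index bookkeeping of steps 3 and 4 can be illustrated with a small sketch. It is not part of the original construction: the function names `cell` and `cell_of` and the toy sizes are assumptions chosen for illustration only. The sketch shows how contiguous, equal-size cells partition the codeword indices, so that a message index selects exactly one cell.

```python
def cell(i, cell_size):
    """Index range of the i-th cell (1-based):
    [(i - 1) * cell_size + 1, i * cell_size]."""
    return range((i - 1) * cell_size + 1, i * cell_size + 1)


def cell_of(k, cell_size):
    """1-based index of the cell that contains codeword index k."""
    return (k - 1) // cell_size + 1


# Toy sizes (hypothetical): 4 messages, 3 codeword indices per cell.
num_messages, cell_size = 4, 3
cells = [list(cell(i, cell_size)) for i in range(1, num_messages + 1)]
```

In the construction above the cell size plays the role of $2^{n(R(U)-R_1-\epsilon)}$ and the number of cells that of $2^{nR_1}$.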

C. Encoding the Relay Messages 

Consider first the relay encoding at $R_{x2}$, which acts as a relay for $R_{x1}$.

1) The $R_{x2}$-relay has a set of $2^{nC_{21}}$ relay messages indexed by $s' \in [1,2^{nC_{21}}]$. For each index $s'$, generate $2^{nR'}$ i.i.d. sequences $\hat{\mathbf{u}}$, each with probability $p(\hat{\mathbf{u}}) = \prod_{i=1}^{n} p(\hat{u}_i)$, where $p(\hat{u}) = \sum_{x,y_1,y_2} p(\hat{u}|y_2)\,p(y_1,y_2|x)\,p(x)$ and $p(x) = \sum_{u,v} p(u,v,x)$. Label these codewords $\hat{\mathbf{u}}(z'|s')$, $s' \in [1,2^{nC_{21}}]$, $z' \in [1,2^{nR'}]$.

2) Randomly and uniformly partition the message set $[1,2^{nR'}]$ into $2^{nC_{21}}$ sets $S'_{s'}$, $s' \in [1,2^{nC_{21}}]$.

3) Encoding: Assume that after receiving $\mathbf{y}_2(i-1)$ we have at $R_{x2}$ that $(\hat{\mathbf{u}}(z'_{i-1}|s'_{i-1}),\mathbf{y}_2(i-1)) \in A_\epsilon^{*(n)}$, and that $z'_{i-1} \in S'_{s'_i}$ ($s'_{i-1}$ is known from the previous transmission of $z'_{i-2}$). Then, at the $i$'th transmission interval the relay transmits the index $s'_i$ to $R_{x1}$.

Relay encoding at $R_{x1}$ is performed in a symmetric manner to the relay encoding at $R_{x2}$. The corresponding variables for $R_{x1}$ are $S''_{s''}$ and $\hat{\mathbf{v}}(z''|s'')$, $s'' \in [1,2^{nC_{12}}]$, $z'' \in [1,2^{nR''}]$.

D. Decoding the Relay Messages at the Relays 

Consider decoding the relay message at $R_{x2}$. The relay decoder at $R_{x2}$ uses its channel input $\mathbf{y}_2(i)$ and its previously decoded $s'_i$ to generate the relay message $z'_i$ as follows: upon receiving $\mathbf{y}_2(i)$, the relay $R_{x2}$ decides that the message $z'_i$ was received at time $i$ if $(\hat{\mathbf{u}}(z'_i|s'_i),\mathbf{y}_2(i)) \in A_\epsilon^{*(n)}$. Following the argument in [34, theorem 6] (see also the proof in [43, Ch. 13.6]), there exists such $z'_i$ with probability that is arbitrarily close to one as long as

$$R' > I(\hat{U};Y_2), \qquad (35)$$

and $n$ is sufficiently large. Relay decoding at $R_{x1}$ is done in a symmetric manner to the relay decoding at $R_{x2}$.

E. Decoding at the Receivers 

We first find the rate constraint for decoding at $R_{x1}$. $R_{x1}$ decodes its message $w_{1,i-1}$ based on its channel input $\mathbf{y}_1(i-1)$ and the relay indices $s'_i$ and $s'_{i-1}$:

1) From knowledge of $s'_{i-1}$ and $\mathbf{y}_1(i-1)$, $R_{x1}$ calculates the set $\mathcal{L}_1(i-1)$ such that

$$\mathcal{L}_1(i-1) = \left\{z' \in [1,2^{nR'}] : (\hat{\mathbf{u}}(z'|s'_{i-1}),\mathbf{y}_1(i-1)) \in A_\epsilon^{*(n)}\right\}.$$

2) At the time interval of the $i$'th codeword, $R_{x1}$ receives the relayed $s'_i$. Since $s'_i$ is selected from a set of $2^{nC_{21}}$ possible messages, it can be transmitted over the noiseless conference link without error.

3) $R_{x1}$ now chooses $\hat{z}'_{i-1}$ as the relay message at time $i-1$ if and only if there exists a unique $\hat{z}'_{i-1} \in S'_{s'_i} \cap \mathcal{L}_1(i-1)$. Again, following the reasoning in [34, theorem 6], this can be done with an arbitrarily small probability of error as long as

$$R' < I(\hat{U};Y_1) + C_{21}, \qquad (36)$$

and $n$ is large enough. Combining this with inequality (35) we get the constraint on the relay information rate:

$$C_{21} > I(\hat{U};Y_2) - I(\hat{U};Y_1). \qquad (37)$$

This expression is similar to the Wyner-Ziv expression for the rate required to transmit $Y_2$ to receiver $R_{x1}$ up to a given distortion, determined by $p(\hat{u}|y_2)$ and a decoder. Here the performance of the decoder is implied in the mutual information $I(U;Y_1,\hat{U})$. The compressed $Y_2^n$ is then used by $R_{x1}$ to assist in decoding $W_1$.

4) Lastly, $R_{x1}$ decodes $w_{1,i-1}$ (or, equivalently, $\mathbf{u}(k_{w_{1,i-1},w_{2,i-1}})$) by choosing $\mathbf{u}(k_{\hat{w}_{1,i-1},\hat{w}_{2,i-1}})$ such that $(\mathbf{u}(k_{\hat{w}_{1,i-1},\hat{w}_{2,i-1}}),\mathbf{y}_1(i-1),\hat{\mathbf{u}}(\hat{z}'_{i-1}|s'_{i-1})) \in A_\epsilon^{*(n)}$. From the point-to-point channel coding theorem (see [15]) we have that $\hat{w}_{1,i-1} = w_{1,i-1}$ with probability that is arbitrarily close to one, as long as $z'_{i-1}$ was correctly decoded at $R_{x1}$ and

$$R_1 < R(U) \triangleq I(U;Y_1,\hat{U}), \qquad (38)$$

for sufficiently large $n$. Combining this with equation (37) yields the rate constraint on $R_1$:

$$R_1 < R(U), \qquad (39)$$
$$\text{as long as } C_{21} > I(\hat{U};Y_2) - I(\hat{U};Y_1). \qquad (40)$$

Using symmetric arguments to those presented for decoding at $R_{x1}$ we find the rate constraint for $R_{x2}$ to be

$$R_2 < R(V), \qquad (41)$$
$$\text{as long as } C_{12} > I(\hat{V};Y_1) - I(\hat{V};Y_2). \qquad (42)$$



Combining equations (34), (39), (40), (41) and (42), gives 
the conditions in theorem 2. 
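The interplay between the random partition of the relay messages and the receiver's candidate list can be sketched in a few lines. This is only an illustration under stated assumptions: the names `random_partition` and `resolve`, the seed, and the toy bin assignment are hypothetical and do not appear in the paper. The sketch shows that the receiver recovers the relay message exactly when the announced bin and its candidate list share a single index.

```python
import random


def random_partition(num_indices, num_bins, seed=0):
    """Assign each index in 1..num_indices to a bin in 1..num_bins,
    independently and uniformly at random."""
    rng = random.Random(seed)
    return {z: rng.randint(1, num_bins) for z in range(1, num_indices + 1)}


def resolve(candidates, bin_index, bin_of):
    """Intersect the candidate list with the announced bin. Return the
    unique surviving index, or None if zero or several indices remain."""
    survivors = [z for z in candidates if bin_of[z] == bin_index]
    return survivors[0] if len(survivors) == 1 else None


# Hand-built toy assignment (hypothetical): indices 1..4 in two bins.
toy_bins = {1: 1, 2: 2, 3: 2, 4: 1}
unique_case = resolve([1, 2], 1, toy_bins)     # only index 1 lies in bin 1
ambiguous_case = resolve([1, 4], 1, toy_bins)  # both candidates lie in bin 1
```

Here the candidate list plays the role of $\mathcal{L}_1(i-1)$ and the announced bin that of $S'_{s'_i}$; the ambiguous case corresponds to a decoding error, whose probability is controlled by condition (36).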

F. Error Events 

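In the scheme described above we have to account for the following error events for decoding $(w_{1,i-1},w_{2,i-1})$:

1) Encoding at the transmitter fails:

$$E_{D,i} = \left\{\|\mathcal{D}_{w_{1,i-1},w_{2,i-1}}\| = 0\right\}.$$

2) Joint typicality decoding fails:

$$E_{0,i} = \left\{(\mathbf{u}(k_{w_{1,i-1},w_{2,i-1}}),\mathbf{v}(l_{w_{1,i-1},w_{2,i-1}}),\mathbf{x}(w_{1,i-1},w_{2,i-1}),\mathbf{y}_1(i-1),\mathbf{y}_2(i-1)) \notin A_\epsilon^{*(n)}\right\}.$$

3) Decoding at the relays fails: $E_{1,i} = E_{11,i} \cup E_{12,i}$,

$$E_{11,i} = \left\{\nexists\, z' \in [1,2^{nR'}] \text{ s.t. } (\hat{\mathbf{u}}(z'|s'_{i-1}),\mathbf{y}_2(i-1)) \in A_\epsilon^{*(n)}\right\},$$
$$E_{12,i} = \left\{\nexists\, z'' \in [1,2^{nR''}] \text{ s.t. } (\hat{\mathbf{v}}(z''|s''_{i-1}),\mathbf{y}_1(i-1)) \in A_\epsilon^{*(n)}\right\}.$$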

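4) Decoding the relay message at the receivers fails: $E_{2,i} = E_{21,i} \cup E_{22,i}$, where $E_{21,i} = E'_{21,i} \cup E''_{21,i}$ and $E_{22,i} = E'_{22,i} \cup E''_{22,i}$,

$$E'_{21,i} = \left\{z'_{i-1} \notin S'_{s'_i} \cap \mathcal{L}_1(i-1)\right\},$$
$$E''_{21,i} = \left\{\exists\, \tilde{z}' \ne z'_{i-1} \text{ s.t. } \tilde{z}' \in S'_{s'_i} \cap \mathcal{L}_1(i-1)\right\},$$
$$E'_{22,i} = \left\{z''_{i-1} \notin S''_{s''_i} \cap \mathcal{L}_2(i-1)\right\},$$
$$E''_{22,i} = \left\{\exists\, \tilde{z}'' \ne z''_{i-1} \text{ s.t. } \tilde{z}'' \in S''_{s''_i} \cap \mathcal{L}_2(i-1)\right\},$$
$$\mathcal{L}_2(i-1) \triangleq \left\{z'' \in [1,2^{nR''}] : (\hat{\mathbf{v}}(z''|s''_{i-1}),\mathbf{y}_2(i-1)) \in A_\epsilon^{*(n)}\right\}.$$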

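5) Final decoding at the receivers fails: $E_{3,i} = E_{31,i} \cup E_{32,i}$, where

$$E_{31,i} = \left\{(\mathbf{u}(k_{w_{1,i-1},w_{2,i-1}}),\mathbf{y}_1(i-1),\hat{\mathbf{u}}(z'_{i-1}|s'_{i-1})) \notin A_\epsilon^{*(n)}\right\} \cup \left\{\exists\, \tilde{w}_1 \ne w_{1,i-1} \text{ s.t. } (\mathbf{u}(k_{\tilde{w}_1,\tilde{w}_2}),\mathbf{y}_1(i-1),\hat{\mathbf{u}}(z'_{i-1}|s'_{i-1})) \in A_\epsilon^{*(n)}\right\},$$
$$E_{32,i} = \left\{(\mathbf{v}(l_{w_{1,i-1},w_{2,i-1}}),\mathbf{y}_2(i-1),\hat{\mathbf{v}}(z''_{i-1}|s''_{i-1})) \notin A_\epsilon^{*(n)}\right\} \cup \left\{\exists\, \tilde{w}_2 \ne w_{2,i-1} \text{ s.t. } (\mathbf{v}(l_{\tilde{w}_1,\tilde{w}_2}),\mathbf{y}_2(i-1),\hat{\mathbf{v}}(z''_{i-1}|s''_{i-1})) \in A_\epsilon^{*(n)}\right\}.$$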

We now bound the probability of the error events at time $i$. Note that at time $i$ both $R_{x1}$ and $R_{x2}$ share the same $s'_{i-1}$ and $s''_{i-1}$ irrespective of whether the decoding at the relays was correct at time $i-1$. Hence, a decoding error at time $i-1$ does not affect the decoding at time $i$. Now, from lemma 1 it follows that by taking $n$ large enough the probability of $E_{D,i}$ can be made arbitrarily small, as long as (34) is satisfied. Additionally, by taking $n$ large enough, the probability $\Pr(E_{0,i} \cap E^c_{D,i})$ can be made arbitrarily small by the properties of strongly typical sequences, see [43, lemma 13.6.2]. The probability $\Pr(E_{1,i})$ can be made arbitrarily small as long as (40) and (42) are satisfied, as explained in section IV-D. Next, the Markov lemma [50, lemma 4.2] and the Markov chains $Y_1 - Y_2 - \hat{U}$ and $Y_2 - Y_1 - \hat{V}$ imply that $\Pr(E'_{21,i} \cap E^c_{1,i} \cap E^c_{0,i})$ and $\Pr(E'_{22,i} \cap E^c_{1,i} \cap E^c_{0,i})$ can be made arbitrarily small by taking $n$ large enough, and $\Pr(E''_{21,i} \cap E^c_{1,i})$ and $\Pr(E''_{22,i} \cap E^c_{1,i})$ can be made arbitrarily small by taking $n$ large enough as long as (40) and (42) are satisfied. Finally, $\Pr(E_{31,i} \cap E^c_{2,i} \cap E^c_{1,i} \cap E^c_{0,i})$ and $\Pr(E_{32,i} \cap E^c_{2,i} \cap E^c_{1,i} \cap E^c_{0,i})$ can be made arbitrarily small by taking $n$ large enough, by the Markov lemma and the chains $(U,Y_1) - Y_2 - \hat{U}$ and $(V,Y_2) - Y_1 - \hat{V}$, and as long as (39) and (41) are satisfied.

This concludes the proof of theorem 2. ■ 

G. An Upper Bound 

Proposition 1: Assume the broadcast channel setup of theorem 2. Then, for sending independent information, any achievable rate pair $(R_1,R_2)$ must satisfy

$$R_1 \le I(X;Y_1) + C_{21},$$
$$R_2 \le I(X;Y_2) + C_{12},$$
$$R_1 + R_2 \le I(X;Y_1,Y_2),$$

for some distribution $p(x)$ on $\mathcal{X}$.

Proof: The proof uses the cut-set bound [43, theorem 14.10.1]. First we define an equivalent system by introducing two orthogonal channels $X'_2 - Y'_1$ from $R_{x2}$ to $R_{x1}$ and $X'_1 - Y'_2$ from $R_{x1}$ to $R_{x2}$. The joint probability distribution function then becomes

$$p\left((y_1,y'_1),(y_2,y'_2)\,|\,x,x'_1,x'_2\right) = p(y_1,y_2|x)\,p(y'_1|x'_2)\,p(y'_2|x'_1),$$

where the signal received at $R_{x1}$ is $(Y_1,Y'_1)$ and the signal received at $R_{x2}$ is $(Y_2,Y'_2)$. As in the proof in section III-A.3, we select $\mathcal{X}'_1$, $\mathcal{X}'_2$, $\mathcal{Y}'_1$, $\mathcal{Y}'_2$, $p(x'_1)$, $p(x'_2)$, $p(y'_1|x'_2)$ and $p(y'_2|x'_1)$ such that the capacities of the channels $X'_2 - Y'_1$ and $X'_1 - Y'_2$ are $C_{21}$ and $C_{12}$ respectively. Additionally, the codewords for the conference transmissions are determined independently from the source codebook, so we set $p(x,x'_1,x'_2) = p(x)\,p(x'_1)\,p(x'_2)$. Now, from the cut-set bound, letting the transmitter and $R_{x2}$ form one group and $R_{x1}$ the second group, we have

$$\begin{aligned}
R_1 &\le I(X,X'_2;Y_1,Y'_1|X'_1) \\
&= I(X'_2;Y_1,Y'_1|X'_1) + I(X;Y_1,Y'_1|X'_1,X'_2) \\
&= I(X'_2;Y'_1|X'_1) + I(X'_2;Y_1|X'_1,Y'_1) + I(X;Y'_1|X'_1,X'_2) + I(X;Y_1|X'_1,X'_2,Y'_1) \\
&= I(X'_2;Y'_1) + I(X;Y_1) \\
&= C_{21} + I(X;Y_1),
\end{aligned}$$

where $I(X'_2;Y_1|X'_1,Y'_1) = I(X;Y'_1|X'_1,X'_2) = 0$ follows from direct application of the distribution function. Similarly we obtain the rate constraint on $R_2$. Lastly, for the sum-rate consider the transmitter in one group and the receivers in the second. Then, the cut-set bound results in

$$\begin{aligned}
R_1 + R_2 &\le I(X;Y_1,Y_2,Y'_1,Y'_2|X'_1,X'_2) \\
&= I(X;Y_1,Y_2|X'_1,X'_2) + I(X;Y'_1,Y'_2|X'_1,X'_2,Y_1,Y_2) \\
&= I(X;Y_1,Y_2),
\end{aligned}$$

yielding the last constraint in the proposition. ■



H. Remarks 

Comment 4.1: Observing the rate constraints in theorem 2 we can see that when (30) and (31) are satisfied, the cooperative rates are greater than the non-cooperative rates due to the (generally) positive terms adding to $I(U;Y_1)$ and $I(V;Y_2)$.

Comment 4.2: We note that although we present a single letter characterization of the rates, we are not able to apply standard cardinality bounding techniques such as those used in [48] or [49] for bounding $\|\mathcal{U}\|$ and $\|\mathcal{V}\|$. The method of [48] cannot be applied since it relies on the fact that the auxiliary random variables are independent, which is not the case here. The method of [49] cannot be applied as explained in the comment for theorem 2 in [20]. The cardinality bounds on $\|\hat{\mathcal{U}}\|$ and $\|\hat{\mathcal{V}}\|$ are trivial since they are transmitted over noiseless links.

Comment 4.3: The relay strategies can be divided into two 
general classes. The first class is referred to as decode-and- 
forward (DAF). In this strategy, the relay first decodes the 
message intended for the destination and then generates a 
relay message based on the decoded information. The second 
class is referred to as estimate-and-forward (EAF). In this 
class the relay does not decode the message intended for the 
destination but transmits an estimate of its channel input to the 
destination. For the physically degraded BC we used DAF, 
based on [34, theorem 1], to derive theorem 1, and for the 
general BC we used the EAF scheme of [34, theorem 6], 
to derive theorem 2. Of course, one can also combine both 
strategies and perform partial decoding at each receiver of the 
other receiver's message before conferencing, following [34, 
theorem 7]. This combination will, in general, result in an 
increased achievable rate region. 

I. Special Cases

1) No Cooperation: $C_{12} = C_{21} = 0$: Consider first cooperation from $R_{x2}$ to $R_{x1}$. Setting $C_{21} = 0$ in theorem 2 implies that

$$H(\hat{U}|Y_1) = H(\hat{U}|Y_2). \qquad (43)$$

From equation (32), the constraint on $R_1$ can be written in the form

$$R_1 < I(U;Y_1) + I(U;\hat{U}|Y_1).$$

Now we find $I(U;\hat{U}|Y_1)$:

$$\begin{aligned}
I(U;\hat{U}|Y_1) &= H(\hat{U}|Y_1) - H(\hat{U}|Y_1,U) \\
&\overset{(a)}{=} H(\hat{U}|Y_2) - H(\hat{U}|Y_1,U) \\
&\overset{(b)}{=} H(\hat{U}|Y_2,Y_1,U) - H(\hat{U}|Y_1,U) \qquad (44) \\
&= -I(\hat{U};Y_2|Y_1,U),
\end{aligned}$$

where (a) is due to (43), and (b) is due to the Markov chain $U - (U,V) - X - (Y_1,Y_2) - Y_2 - \hat{U}$, which implies that given $Y_2$, $\hat{U}$ is independent of $Y_1$ and $U$. Now, since mutual information is non-negative, we conclude that $I(U;\hat{U}|Y_1) = 0$. Hence, the rate constraint on $R_1$ becomes

$$R_1 < I(U;Y_1).$$
Similarly, the maximum rate $R_2$ is given by $I(V;Y_2)$, and in conclusion, when $C_{12} = C_{21} = 0$ we revert to the rate region without cooperation derived in [14] (with a constant $W$).

2) Full Cooperation: $C_{12} = H(Y_1|Y_2)$, $C_{21} = H(Y_2|Y_1)$: When $C_{12} = H(Y_1|Y_2)$, we get from (31) that

$$H(Y_1|Y_2) = C_{12} > I(\hat{V};Y_1) - I(\hat{V};Y_2) = H(\hat{V}|Y_2) - H(\hat{V}|Y_1),$$

which is satisfied when $\hat{V} = Y_1$. Plugging this into (33), we get that when full cooperation from $R_{x1}$ to $R_{x2}$ is available, the rate constraint for $R_{x2}$ becomes

$$R_2 < I(V;Y_2,Y_1).$$

Using the same reasoning we conclude that when full cooperation from $R_{x2}$ to $R_{x1}$ is available, the rate constraint for $R_{x1}$ becomes $R_1 < I(U;Y_1,Y_2)$.

3) Partial Cooperation: When $0 < C_{12} < H(Y_1|Y_2)$ and $0 < C_{21} < H(Y_2|Y_1)$, we get that

$$C_{21} > H(\hat{U}|Y_1) - H(\hat{U}|Y_2) \;\Rightarrow\; H(\hat{U}|Y_1) < C_{21} + H(\hat{U}|Y_2). \qquad (45)$$

Hence, the achievable rate to $R_{x1}$ is upper bounded by

$$\begin{aligned}
R_1 &< I(U;Y_1,\hat{U}) \\
&= I(U;Y_1) + I(U;\hat{U}|Y_1) \\
&= I(U;Y_1) + H(\hat{U}|Y_1) - H(\hat{U}|U,Y_1) \\
&\overset{(a)}{<} I(U;Y_1) + H(\hat{U}|Y_2) - H(\hat{U}|U,Y_1) + C_{21} \\
&\overset{(b)}{=} I(U;Y_1) + H(\hat{U}|Y_2,Y_1,U) - H(\hat{U}|U,Y_1) + C_{21},
\end{aligned}$$

that is,

$$R_1 < I(U;Y_1) + C_{21} - I(\hat{U};Y_2|U,Y_1), \qquad (46)$$

where (a) is due to (45) and (b) follows from the same reasoning leading to equation (44). Similarly, $R_2 < I(V;Y_2) + C_{12} - I(\hat{V};Y_1|V,Y_2)$.

Note that there exist negative terms $-I(\hat{U};Y_2|U,Y_1)$ and $-I(\hat{V};Y_1|V,Y_2)$ in the achievable rate upper bounds. This can be explained as follows: the mutual information $I(\hat{U};Y_2|U,Y_1)$ can be considered as a type of "ancillary" information that $\hat{U}$ contains, since this information is contained in $\hat{U}$ while $U$ and $Y_1$ are already known. Therefore, this information is a "noise" part of $Y_2$ which does not include any helpful information for decoding $U$ at $R_{x1}$. Thus, for cooperating in the optimal way, $\hat{U}$ has to be a type of "sufficient and complete" cooperation information.



V. The General Broadcast Channel with a Single Common Message

We now consider the case where only a single message, rather than two independent messages, is transmitted to both receivers. The main motivation for considering this case is that in the two independent messages case it is difficult to specify an explicit cooperation scheme, and we therefore have to represent cooperation through auxiliary random variables. Hence, we cannot identify directly the gain from cooperation, except in the case of full cooperation, and we also cannot evaluate the achievable region. For the single common message case, we are able to derive results for partial cooperation without auxiliary variables, which make this region explicitly computable. This scenario is depicted in figure 4.

Fig. 4. The single message broadcast channel with cooperating receivers. $\hat{W}$ and $\hat{\hat{W}}$ are the estimates of $W$ at $R_{x1}$ and $R_{x2}$ respectively.

For this scenario we need to specialize the definitions of a code and the average probability of error as follows:

• A $(2^{nR}, n, (C_{12}, C_{21}))$ code for sending a common message over the broadcast channel with cooperating receivers having conference links of capacities $C_{12}$ and $C_{21}$ between them, is defined in a similar manner to definition 6 with $\mathcal{W}_1$, $\mathcal{W}_2$ and $\mathcal{W}_1 \times \mathcal{W}_2$ all replaced with $\mathcal{W} = \{1,2,\ldots,2^{nR}\}$.

• The average probability of error is defined similarly to definition 7 with $W_1$ and $W_2$ replaced with $W$.

The capacity for the non-cooperative single message scenario is given in [5] by

$$C = \sup_{p(x)} \left\{ \min\left(I(X;Y_1),\, I(X;Y_2)\right) \right\}. \qquad (47)$$

In the following we consider two cooperation schemes, referred to as a single-step scheme and a two-step scheme. These schemes are described in figure 5. In the single-step scheme, after reception each receiver generates a single cooperation message based on its channel input. In the two-step scheme, after reception one receiver generates a cooperation message based only on its channel input, as in the previous case, but the second receiver generates its cooperation message only after decoding (which is done with the help of the conference message from the first receiver). In both cases each receiver generates a single conference message; however, in the single-step conference the emphasis is on low delay, while in the two-step conference we sacrifice delay in order to gain rate.

Fig. 5. Schematic description of the single-step and the two-step conference schemes. [Figure omitted; recoverable labels: reception and conference phases at time $i$, single-step conference.]

A. Decoding with a Single-Step Cooperation

In this section we constrain both decoders to output their 
decoded messages after a conference that consists of a single 
message from each receiver, based only on its received channel 
input. For this case, we can specialize the derivation of theo- 
rem 2 and get the following achievable rate for the broadcast 
channel with partially cooperating receivers: 

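Theorem 3: Let $(\mathcal{X}, p(y_1,y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$ be any discrete memoryless broadcast channel, with cooperating receivers having noiseless conference links of finite capacities $C_{12}$ and $C_{21}$, as defined in section II. Then, for sending a common message to both receivers, any rate $R$ satisfying

$$R < \sup_{p(x)} \left\{ \min\left(I(X;Y_1,\hat{U}),\, I(X;Y_2,\hat{V})\right) \right\},$$

subject to

$$C_{21} > I(\hat{U};Y_2) - I(\hat{U};Y_1),$$
$$C_{12} > I(\hat{V};Y_1) - I(\hat{V};Y_2),$$

for some joint distribution $p(x,y_1,y_2,\hat{u},\hat{v}) = p(x)\,p(y_1,y_2|x)\,p(\hat{u}|y_2)\,p(\hat{v}|y_1)$ is achievable, with $\|\hat{\mathcal{U}}\| \le \|\mathcal{Y}_2\| + 1$ and $\|\hat{\mathcal{V}}\| \le \|\mathcal{Y}_1\| + 1$.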

The proof of theorem 3 follows the same lines of the proof 
of theorem 2 and will not be repeated here. We next show 
how we can increase the rates by introducing the two-step 
conference. 
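To make the single-step quantities concrete, the following sketch evaluates the conference-rate requirement $I(\hat{U};Y_2) - I(\hat{U};Y_1)$ and the rate term $I(X;Y_1,\hat{U})$ for one particular setting. Everything specific here is an assumption made for illustration: two independent binary symmetric channels with crossover probability 0.1, a uniform input, the choice $\hat{U} = Y_2$, and the helper names `entropy`, `marginal`, `mutual_information` and `bsbc_joint`.

```python
from itertools import product
from math import log2


def entropy(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(joint, keep):
    """Marginal pmf over the coordinate positions listed in keep."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def mutual_information(joint, a, b):
    """I(A;B) for disjoint tuples of coordinate positions a and b."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))


def bsbc_joint(p, p0):
    """Joint pmf of (x, y1, y2, u_hat) for two independent BSCs with
    crossover p and Pr(X = 0) = p0, with the assumed choice u_hat = y2."""
    joint = {}
    for x, y1, y2 in product((0, 1), repeat=3):
        px = p0 if x == 0 else 1 - p0
        p1 = 1 - p if y1 == x else p
        p2 = 1 - p if y2 == x else p
        joint[(x, y1, y2, y2)] = px * p1 * p2
    return joint


X, Y1, Y2, UH = (0,), (1,), (2,), (3,)
joint = bsbc_joint(p=0.1, p0=0.5)
c21_needed = mutual_information(joint, UH, Y2) - mutual_information(joint, UH, Y1)
rate_rx1 = mutual_information(joint, X, Y1 + UH)
```

With $\hat{U} = Y_2$ the requirement reduces to $H(Y_2|Y_1)$ and the rate term to $I(X;Y_1,Y_2)$, which is the full-cooperation operating point; other test channels $p(\hat{u}|y_2)$ would trade a smaller conference rate for a smaller rate term.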

B. Decoding with a Two-Step Cooperation 

We consider a two-step conference: at the first step only one 
receiver decodes the message. The second receiver decodes 
after the second step. Therefore, after the first receiver decodes 
the message, relaying to the second receiver reduces to the 
decode-and-forward relay situation of [34, theorem 1]. The 
rates achievable with a two step conference are given in the 
following theorem: 

Theorem 4: Assume the broadcast channel setup of theorem 3. Then, for sending a common message to both receivers, any rate $R$ satisfying

$$R < \sup_{p(x)} \max\left\{R_{12}(p(x)),\, R_{21}(p(x))\right\}$$

with

$$R_{12}(p(x)) \triangleq \min\Big(I(X;Y_1) + C_{21},\; I(X;Y_2) - I(\hat{V};Y_1|Y_2,X) + \min\big(C_{12},\, H(\hat{V}|Y_2) - H(\hat{V}|Y_1)\big)\Big),$$

and

$$R_{21}(p(x)) \triangleq \min\Big(I(X;Y_2) + C_{12},\; I(X;Y_1) - I(\hat{U};Y_2|Y_1,X) + \min\big(C_{21},\, H(\hat{U}|Y_1) - H(\hat{U}|Y_2)\big)\Big),$$

for some joint distribution $p(x,y_1,y_2,\hat{u},\hat{v}) = p(x)\,p(y_1,y_2|x)\,p(\hat{u}|y_2)\,p(\hat{v}|y_1)$ is achievable, with $\|\hat{\mathcal{U}}\| \le \|\mathcal{Y}_2\| + 1$ and $\|\hat{\mathcal{V}}\| \le \|\mathcal{Y}_1\| + 1$, and with the appropriate $C_{12} > I(\hat{V};Y_1|Y_2,X)$ or $C_{21} > I(\hat{U};Y_2|Y_1,X)$ (the one used for the first cooperation step).

Proof: 

1) Overview of Coding Strategy: The scheme described in theorem 3 uses a single-step conference for both decoders. However, if we let one receiver use a two-step conference, then that receiver, instead of using conference information derived from the raw input of the other receiver, can use information generated by the second receiver after it already decoded the message. This conference information is less noisy, and thus the rate to the first receiver can be increased.

To put this in more concrete terms, assume that at time $i+1$, $R_{x1}$ sends to $R_{x2}$ the index $s'_{i+1}$ of the partition into which its relay message at time $i$, denoted $z_{\hat{v},i}$, belongs. In appendix B we show that $R_{x2}$ can decode the message $w_{0,i}$ with an arbitrarily small probability of error as long as

$$R < I(X;Y_2) - I(\hat{V};Y_1|Y_2,X) + \min\left(C_{12},\, H(\hat{V}|Y_2) - H(\hat{V}|Y_1)\right), \qquad (48)$$
$$C_{12} > I(\hat{V};Y_1|Y_2,X). \qquad (49)$$

We now introduce the following modifications to the scheme used in theorem 3:

2) Relay Sets Generation at $R_{x2}$: $R_{x2}$ partitions the message set $\mathcal{W}$ into $2^{nC_{21}}$ subsets in a uniform and independent manner. Denote these subsets with $S''_{s''}$, $s'' \in [1, 2^{nC_{21}}]$.

3) Relay Encoding at $R_{x2}$: $R_{x2}$ has an estimate $\hat{w}_{0,i}$ of the message $w_{0,i}$. Now, $R_{x2}$ looks for the partition into which $\hat{w}_{0,i}$ belongs and sends the index of this partition, denoted $s''_{i+2}$, to $R_{x1}$ at time $i+2$.

4) Decoding at $R_{x1}$: Upon reception of $\mathbf{y}_1(i)$, $R_{x1}$ generates the set $\mathcal{L}_1(i) = \{w \in \mathcal{W} : (\mathbf{x}(w),\mathbf{y}_1(i)) \in A_\epsilon^{*(n)}(X,Y_1)\}$. At time $i+2$, upon reception of $s''_{i+2}$, $R_{x1}$ looks for an index $w$ such that $w \in \mathcal{L}_1(i) \cap S''_{s''_{i+2}}$. If a unique such $w$ exists then $R_{x1}$ sets its own estimate $\hat{w}_{0,i} = w$, otherwise an error is declared.

5) Bounding the Probability of Error: Using the proof technique in [34, theorem 1], it can be easily shown that, assuming correct decoding at $R_{x2}$, any rate $R < I(X;Y_1) + C_{21}$ is achievable to $R_{x1}$.

Combining the bounds derived above, we conclude that with a two-step conference at $R_{x1}$, any rate satisfying

$$R < \min\Big(I(X;Y_1) + C_{21},\; I(X;Y_2) - I(\hat{V};Y_1|Y_2,X) + \min\big(C_{12},\, H(\hat{V}|Y_2) - H(\hat{V}|Y_1)\big)\Big),$$
$$C_{12} > I(\hat{V};Y_1|Y_2,X),$$

is achievable. Repeating the same derivation when $R_{x2}$ uses a two-step conference, and combining with the previous case proves theorem 4. ■

Setting $\hat{U} = Y_2$, $\hat{V} = Y_1$ in theorem 4 we obtain the following achievable region:
Corollary 1: Assume the broadcast channel setup of theorem 3. Then, for sending a common message to both receivers, any rate $R$ satisfying

$$R < \sup_{p(x)} \max\left\{R_{12}(p(x)),\, R_{21}(p(x))\right\},$$
$$R_{12}(p(x)) \triangleq \min\Big(I(X;Y_1) + C_{21},\; I(X;Y_2) - H(Y_1|Y_2,X) + \min\big(C_{12},\, H(Y_1|Y_2)\big)\Big),$$
$$R_{21}(p(x)) \triangleq \min\Big(I(X;Y_2) + C_{12},\; I(X;Y_1) - H(Y_2|Y_1,X) + \min\big(C_{21},\, H(Y_2|Y_1)\big)\Big),$$

with the appropriate $C_{12} > H(Y_1|Y_2,X)$ or $C_{21} > H(Y_2|Y_1,X)$ (the one used for the first cooperation step), is achievable.

This gives a partial cooperation result without auxiliary random variables.

C. An Example for Corollary 1 

Consider two independent, identical, BSBCs with transition probability $p$, and cooperation links of capacities $C_{12} = C_{21} = C$. For this case, corollary 1 gives the following maximum achievable rate:

$$\begin{aligned}
R &= \sup_{p_0} \min\Big(H(Y_1) - h(p) + C,\; \min\big(H(Y_1) + C,\, H(Y_1,Y_2)\big) - 2h(p)\Big) \\
&= \sup_{p_0} \min\Big(H(Y_1) - 2h(p) + C,\; H(Y_1,Y_2) - 2h(p)\Big)
\end{aligned}$$

for $C > h(p)$, where $\mathcal{Y}_1 = \mathcal{Y}_2 = \mathcal{X} = \{0,1\}$, $p_0 = \Pr(X = 0)$, and

$$\Pr(y_1,y_2) = \begin{cases} (1-p)^2 p_0 + p^2 (1-p_0), & y_1 = y_2 = 0 \\ p(1-p), & y_1 \ne y_2 \\ p^2 p_0 + (1-p)^2 (1-p_0), & y_1 = y_2 = 1, \end{cases}$$

$$\Pr(y_1) = \begin{cases} (1-p)p_0 + p(1-p_0), & y_1 = 0 \\ p\,p_0 + (1-p)(1-p_0), & y_1 = 1. \end{cases}$$

Solving for the supremum for each value of $C$, we get the achievable rates depicted in figure 6. Note the linear increase in the achievable rate for $H(Y_2|Y_1,X) < C < H(Y_2|Y_1)$.
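The rate expression of this example is easy to evaluate numerically. The sketch below is an illustration only: the function names, the grid resolution, and the use of a grid search over $p_0$ in place of an exact supremum are assumptions, and the formula is applied only in the regime $C > h(p)$ stated above.

```python
from math import log2


def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)


def entropy(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * log2(q) for q in probs if q > 0)


def corollary1_rate(p, C, grid=200):
    """Grid search over p0 = Pr(X = 0) of
    min(H(Y1) - 2h(p) + C, H(Y1,Y2) - 2h(p)), intended for C > h(p)."""
    best = 0.0
    for k in range(grid + 1):
        p0 = k / grid
        same0 = (1 - p) ** 2 * p0 + p ** 2 * (1 - p0)  # Pr(y1 = y2 = 0)
        same1 = p ** 2 * p0 + (1 - p) ** 2 * (1 - p0)  # Pr(y1 = y2 = 1)
        diff = p * (1 - p)                             # Pr of each unequal pair
        h12 = entropy([same0, diff, diff, same1])      # H(Y1, Y2)
        h1 = h((1 - p) * p0 + p * (1 - p0))            # H(Y1)
        best = max(best, min(h1 - 2 * h(p) + C, h12 - 2 * h(p)))
    return best
```

For example, with $p = 0.1$ the values `corollary1_rate(0.1, 0.5)` and `corollary1_rate(0.1, 0.6)` fall in the linear regime and differ by the increase in $C$, while for large $C$ the rate saturates at $H(Y_1,Y_2) - 2h(p)$ evaluated at $p_0 = 1/2$.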

D. An Upper Bound 

The upper bound for the single common message case can 
be obtained from the bound for the two independent messages 
case in proposition 1: 

Corollary 2: Let $(\mathcal{X}, p(y_1,y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$ be any discrete memoryless broadcast channel, with cooperating receivers having noiseless conference links of finite capacities $C_{12}$ and $C_{21}$, as defined in section II. Then, for sending a common message to both receivers, any rate $R$ must satisfy

$$R \le \sup_{p(x)} \left\{ \min\left(I(X;Y_1) + C_{21},\; I(X;Y_2) + C_{12},\; I(X;Y_1,Y_2)\right) \right\}.$$



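Fig. 6. Achievable rate vs. $C$, for the two independent, identical, BSBCs with a single common message, resulting from corollary 1. [Figure omitted; recoverable axis marks: $I(X;Y_1)$ and $I(X;Y_1,Y_2)$ on the rate axis, $H(Y_2|Y_1,X)$ and $H(Y_2|Y_1)$ on the $C$ axis.]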



Proof: Follows directly from proposition 1 by noting 
that the common rate has to satisfy all three constraints: the 
individual rates and the sum rate. ■ 
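As a numerical companion to corollary 2, the sketch below evaluates the three terms of the upper bound for two independent binary symmetric channels at a fixed input distribution. The setting and the names `h`, `entropy` and `cutset_upper_bound` are assumptions made for illustration; the supremum over $p(x)$ in the corollary is not performed here, although by symmetry one would expect $p_0 = 1/2$ to attain it in this example.

```python
from math import log2


def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)


def entropy(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(q * log2(q) for q in probs if q > 0)


def cutset_upper_bound(p, p0, c12, c21):
    """min(I(X;Y1) + C21, I(X;Y2) + C12, I(X;Y1,Y2)) for two independent
    BSCs with crossover p and input distribution Pr(X = 0) = p0."""
    i_single = h((1 - p) * p0 + p * (1 - p0)) - h(p)  # I(X;Y1) = I(X;Y2)
    same0 = (1 - p) ** 2 * p0 + p ** 2 * (1 - p0)
    same1 = p ** 2 * p0 + (1 - p) ** 2 * (1 - p0)
    diff = p * (1 - p)
    i_joint = entropy([same0, diff, diff, same1]) - 2 * h(p)  # I(X;Y1,Y2)
    return min(i_single + c21, i_single + c12, i_joint)
```

With both link capacities set to zero the bound collapses to the single-receiver mutual information, and for large capacities it is limited by $I(X;Y_1,Y_2)$, the rate of a receiver that observes both outputs.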



E. Remarks 

Comment 5.1: Note that there are special cases where the lower bound of corollary 1 coincides with the upper bound of corollary 2, yielding the capacity for these cases. For example, assume a strong version of the "more capable" condition of [5]: $I(X;Y_1) \gg I(X;Y_2)$ ${}^{2}$ for all input distributions $p(x)$ on $\mathcal{X}$. Assume also that $H(Y_2|Y_1,X) < C_{21} < H(Y_2|Y_1)$ and $H(Y_1|Y_2,X) < C_{12} < H(Y_1|Y_2)$. Under these conditions, we have that $I(X;Y_1) + C_{21} > I(X;Y_2) + C_{12} - H(Y_1|Y_2,X)$. Thus, if $R_{x1}$ is helping $R_{x2}$ first, the achievable rate is $I(X;Y_2) + C_{12} - H(Y_1|Y_2,X)$. If $R_{x2}$ is helping $R_{x1}$ first, then the achievable rate is $I(X;Y_2) + C_{12}$. Since $C_{12} - H(Y_1|Y_2,X) < C_{12}$, this cooperation scheme achieves the upper bound $R = \sup_{p(x)} \{I(X;Y_2) + C_{12}\}$.

Comment 5.2: Note that the capacity region for the deterministic broadcast channel with cooperating receivers follows from corollary 1 and corollary 2. This region was derived in [51]. For this case we have that $H(Y_1|X) = H(Y_2|X) = 0$, hence $I(X;Y_i) = H(Y_i)$, $i = 1,2$. The achievable rate (from corollary 1) is given by

$$\begin{aligned}
R &< \min\left\{H(Y_2) + C_{12},\; H(Y_1) + \min\left(C_{21},\, H(Y_2|Y_1)\right)\right\} \\
&= \min\left\{H(Y_2) + C_{12},\; H(Y_1) + C_{21},\; H(Y_1,Y_2)\right\},
\end{aligned}$$

and the same from corollary 2.

Comment 5.3: We note that although the expressions in (48) and (49) seem different from the EAF expression of [34, theorem 6], given in theorem 3 (cf. $R < I(X;Y_2,\hat{V})$, subject to $C_{12} > I(\hat{V};Y_1) - I(\hat{V};Y_2)$), this does not improve on the achievable rate of the standard EAF. The reason is that every rate achievable according to (48)-(49) can also be achieved with the standard EAF using the same mapping of the auxiliary RV and an appropriate time-sharing ${}^{3}$. However, when considering a specific, fixed assignment of the auxiliary random variable (such as in corollary 1), the rate achievable with (48)-(49) is indeed greater than the classic EAF with the same assignment.

${}^{2}$ The precise condition requires that $I(X;Y_1) > I(X;Y_2) + C_{12} - C_{21} + H(Y_2|Y_1,X)$ for all input distributions $p(x)$.

${}^{3}$ This observation is due to Shlomo Shamai and Gerhard Kramer.

VI. Conclusions 

In this paper we investigated the effect of cooperation 
between receivers on the rates for the broadcast channel. As 
communication networks evolve, it can be expected that in 
future networks, nodes that are close enough to be able to 
communicate directly, will use this ability to help each other 
in reception. Accommodating this characteristic, we extended 
the traditional broadcast scenario, in which each decoder is 
assumed to operate independently, into a scenario where the 
receivers have finite capacity links used for cooperation. We 
analyzed three related scenarios: the physically degraded BC 
- for which we derived the capacity region, the general BC 
for which we presented an achievability result, and the single 
common message case. For the last case we identified a special 
case where capacity can be achieved. We note that it is not 
trivial to extend these results to more than two steps, since 
the intermediate steps need to extract information from partial 
relay information. Although this can be done by introducing 
additional auxiliary variables, obtaining a computable region is 
not a simple task. This study is an initial step in this investiga- 
tion and future work includes several extensions: a natural first 
extension is to consider a fully wireless system, and extend the 
analysis to the Gaussian case. Another extension is to consider 
the interaction between the Wyner-Ziv compression and the 
achievable rates for the general channel. 
Appendix A
Background Results

Consider the construction in section III-A. Let $\mathcal{L}(i-1) = \{w_2 : (\mathbf{y}_2(i-1),\mathbf{u}(w_2|s_{i-1})) \in A_\epsilon^{(n)}\}$. We bound $E_{\mathbf{y}_2}\{\|\mathcal{L}(i-1)\|\}$. Let

$$\psi(w_2|\mathbf{y}_2(i-1)) = \begin{cases} 1, & (\mathbf{u}(w_2|s_{i-1}),\mathbf{y}_2(i-1)) \in A_\epsilon^{(n)} \\ 0, & \text{otherwise.} \end{cases}$$

Hence, as in [34, theorem 1], we can write the random variable $\|\mathcal{L}(i-1)\|$ as a sum of random variables:

$$\|\mathcal{L}(i-1)\| = \sum_{w_2=1}^{2^{nR_2}} \psi(w_2|\mathbf{y}_2(i-1)),$$

and therefore

$$E_{\mathbf{y}_2}\{\|\mathcal{L}(i-1)\|\} = E_{\mathbf{y}_2}\{\psi(w_{2,i-1}|\mathbf{y}_2(i-1))\} + \sum_{\substack{w_2=1 \\ w_2 \ne w_{2,i-1}}}^{2^{nR_2}} E_{\mathbf{y}_2}\{\psi(w_2|\mathbf{y}_2(i-1))\}.$$

When $w_2 \ne w_{2,i-1}$ we get from the properties of independent sequences ([43, theorem 8.6.1]) that

$$E_{\mathbf{y}_2}\{\psi(w_2|\mathbf{y}_2(i-1))\} = \Pr\{\psi(w_2|\mathbf{y}_2(i-1)) = 1\} \le 2^{-n(I(U;Y_2)-3\epsilon)},$$

thus,

$$E_{\mathbf{y}_2}\{\|\mathcal{L}(i-1)\|\} \le 1 + 2^{nR_2}\,2^{-n(I(U;Y_2)-3\epsilon)}. \qquad (A.1)$$

Note that this result holds also when considering the strongly typical set rather than the weakly typical set.

Appendix B 
Proof of the Achievable Rate to the First 
Decoder in Theorem 4 (equations (48) and (49)) 

A. Overview of Coding Strategy 

The encoder generates a single codebook in a random and independent manner. Next, the first relay partitions its collection of relay codewords ($\mathcal{Z}(\hat{V})$ for $R_{x1}$) into disjoint sets. When a channel input is received, the first relay finds the index of the partition set which contains a relay codeword jointly typical with its channel input, and transmits it over the noiseless conference link to the second receiver. Then, the second receiver looks for a unique source codeword that is jointly typical with its channel input, and with at least one of the relay codewords in the set of possible codewords received from the first relay.

In the following analysis we assume that $R_{x1}$ is the first relay and $R_{x2}$ decodes first.

B. Codebook Generation and Encoding at the Transmitter 
Fix p(x) and generate 2 nR i.i.d. codewords x, with 

p(xH) = FEU PfoH)> w e W = {1,2,. ..,2 nR }. For 
transmitting the message wo,i at time i, the transmitter outputs 
x (w ; o,i) to the channel. 

C. Relay Sets Generation 
Fix p(v\yi). 

• Consider the p.d.f. p(v) = 
T,x,y u y 2 P(v\yi)p(yi,y 2 \x)p{x) on V. 

• R x i generates 2 nRl v sequences in an i.i.d. manner 
according to p(v(z$)) = F^Li p(vi(zv)), z v <E Z(V) = 
{l,2,...,2» fl i}. 

• R x i partitions the message set Z(V) into 2 nCl2 sets, 
by assigning an index between [l,2 raC,12 l to each z% <E 
Z(V), in a random, independent and uniform manner 
over [l, 2 nCl2 ] . Denote these sets by S' 3 „ s' e [l, 2 nCl2 ] . 



w 2 = l 

W 2 ^W 2 i-± 



D. Decoding and Encoding at the Relay (R x \) 

« Upon reception of yi (i), the relay R x \ decides that z$ t i £ 
Z(V) was received if (v(z e ,i), yi(i)) G A* t [n \v , Yi). 
Now, R x i finds the index s' A , n of the set S' , s.t. Zy < S 

'^r 1 s i+l 

S' , . Then, at time i + 1, R x \ transmits s' , , to R X 2 

s i+i 

through the finite capacity noiseless conference link. If 
there is no z v G Z(V) such that v(z v ) is jointly typical 
with yi(«), an error is declared. 
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E. Decoding the Source Message at R X 2 

At the $i$'th transmission interval $R_{x2}$ generates the set 
$\mathcal{L}_2(i) = \{w \in \mathcal{W} : (\mathbf{x}(w), \mathbf{y}_2(i)) \in A_\epsilon^{*(n)}(X, Y_2)\}$. At the 
$(i+1)$'th transmission interval, $R_{x2}$ receives $s'_{i+1}$ from $R_{x1}$ 
through the noiseless conference link. $R_{x2}$ then looks for a 
unique $\hat{w}_0$ s.t. $\hat{w}_0 \in \mathcal{L}_2(i)$ and $\exists z_v \in S'_{s'_{i+1}}$ for which 
$(\mathbf{x}(\hat{w}_0), \mathbf{y}_2(i), \mathbf{v}(z_v)) \in A_\epsilon^{*(n)}(X, Y_2, V)$. If such a unique $\hat{w}_0$ 
exists, then $\hat{w}_0$ is the decoded message at time $i$. If there is 
none, or there is more than one, an error is declared. 
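The decoding rule at $R_{x2}$ can be sketched as a list intersection. The following is a minimal illustration, not the paper's construction: the function `decode_at_rx2` and the predicate `jointly_typical` are hypothetical names, and a lookup table stands in for the joint-typicality test against $\mathbf{y}_2(i)$.

```python
def decode_at_rx2(candidate_list, received_bin, jointly_typical):
    """Return the unique w in candidate_list (the list L_2(i)) for which some
    relay index z_v in received_bin (the set S'_{s'_{i+1}}) satisfies
    jointly_typical(w, z_v). Return None when no such w exists or when more
    than one exists, which corresponds to declaring an error."""
    matches = [w for w in candidate_list
               if any(jointly_typical(w, z) for z in received_bin)]
    return matches[0] if len(matches) == 1 else None

# Toy example: (w, z_v) pairs assumed to pass the typicality test with y_2(i).
typical_pairs = {(3, 7)}
is_typical = lambda w, z: (w, z) in typical_pairs

decoded = decode_at_rx2([1, 3, 5], [2, 7, 9], is_typical)            # one match
ambiguous = decode_at_rx2([1, 3, 5], [2, 7, 9], lambda w, z: True)   # several matches
missing = decode_at_rx2([1, 3, 5], [2, 7, 9], lambda w, z: False)    # no match
```

The two failure modes of the sketch (no match, more than one match) correspond to the two ways the decoder above declares an error.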



F. Analysis of the Probability of Error 

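1) Error Events: The error events for the scheme described 
above, for decoding the message $w_{0,i}$, are: 

1) Relay decoding fails: 
$E_{0,i} = \{\nexists z_v \in \mathcal{Z}(V) \text{ s.t. } (\mathbf{v}(z_v), \mathbf{y}_1(i)) \in A_\epsilon^{*(n)}(V, Y_1)\}$. 

2) Joint typicality decoding fails: Let $E_{1,i} = E'_{1,i} \cup E''_{1,i}$, 
where 

$E'_{1,i} = \{(\mathbf{x}(w_{0,i}), \mathbf{y}_1(i), \mathbf{y}_2(i)) \notin A_\epsilon^{*(n)}(X, Y_1, Y_2)\}$, 

$E''_{1,i} = \{(\mathbf{x}(w_{0,i}), \mathbf{v}(z_{v,i}), \mathbf{y}_2(i)) \notin A_\epsilon^{*(n)}(X, V, Y_2)\}$. 

3) Decoding at $R_{x2}$ fails: $E_{2,i} = E'_{2,i} \cup E''_{2,i}$, where 

$E'_{2,i} = \{\nexists z_v \in S'_{s'_{i+1}} \text{ for which } (\mathbf{x}(w_{0,i}), \mathbf{v}(z_v), \mathbf{y}_2(i)) \in A_\epsilon^{*(n)}(X, V, Y_2)\}$, 

$E''_{2,i} = \{\exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i) \text{ s.t. } \exists z_v \in S'_{s'_{i+1}},\ (\mathbf{x}(w), \mathbf{v}(z_v), \mathbf{y}_2(i)) \in A_\epsilon^{*(n)}(X, V, Y_2)\}$. 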

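Next, applying the union bound we get that 

$$
\begin{aligned}
P_e^{(n)} &= \Pr\Big(\bigcup_{k=0}^{2} E_{k,i}\Big) \\
&= \Pr(E_{0,i}) + \Pr(E_{1,i} \cap E_{0,i}^c) + \Pr(E_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c).
\end{aligned}
$$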



2) Bounding the Probabilities of the Error Events: Following 
the same argument as in section IV-D, $R'_1 > I(V; Y_1)$ 
implies that taking $n$ large enough, we can make $\Pr(E_{0,i}) < \epsilon$. 
Next, from the properties of strongly typical sequences (see 
[43, lemma 13.6.1]), by taking $n$ large enough, we can make 
$\Pr(E'_{1,i}) < \epsilon/2$. Additionally, the Markov lemma [50, lemma 
4.2] implies that we can make $\Pr(E''_{1,i} \cap E'^c_{1,i} \cap E_{0,i}^c) < \epsilon/2$ 
for any arbitrary $\epsilon > 0$ by taking $n$ large enough. Therefore, 
by the union bound, $\Pr(E_{1,i} \cap E_{0,i}^c) < \epsilon$. We also have that 
$\Pr(E'_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c) = 0$, because under $E_{1,i}^c \cap E_{0,i}^c$ we 
have that $\mathbf{x}(w_{0,i})$, $\mathbf{y}_2(i)$ and $\mathbf{v}(z_{v,i})$ are jointly typical, and 
by construction, $z_{v,i} \in S'_{s'_{i+1}}$. Hence, we need to show that 
the probability $\Pr(E''_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c)$ can be made arbitrarily 
small. Note that due to the symmetry of the construction, the 
probability of error does not depend on the specific message 
$w_{0,i}$ transmitted. 



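3) Bounding $\Pr(E''_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c)$: The probability of 
$E''_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c$ can be written as 

$$
\begin{aligned}
\Pr(E''_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c)
&= \Pr\Big(\big\{\exists z_v \in S'_{s'_{i+1}},\ \exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i),\ (\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_v)) \in A_\epsilon^{*(n)}(X, Y_2, V)\big\} \cap E_{1,i}^c \cap E_{0,i}^c\Big) \\
&\stackrel{(a)}{=} \Pr\Big(\big\{\exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i),\ (\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_{v,i})) \in A_\epsilon^{*(n)}(X, Y_2, V)\big\} \cap E_{1,i}^c \cap E_{0,i}^c\Big) \\
&\quad + \Pr\Big(\big\{\exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i),\ \exists z_v \in S'_{s'_{i+1}},\ z_v \neq z_{v,i},\ (\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_v)) \in A_\epsilon^{*(n)}(X, Y_2, V)\big\} \cap E_{1,i}^c \cap E_{0,i}^c\Big),
\end{aligned}
$$

where (a) is because the elements of $S'_{s'_{i+1}}$ are selected in an 
independent manner. In what follows, $\Pr(E''_{21,i})$ and $\Pr(E''_{22,i})$ denote 
the first and second terms on the right-hand side of (a), respectively. 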

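We first bound $\Pr(E''_{21,i})$ as follows: 

$$
\begin{aligned}
\Pr(E''_{21,i})
&= \sum_{\mathbf{y}_2} \Pr\Big(\big\{\exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i),\ (\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_{v,i})) \in A_\epsilon^{*(n)}(X, Y_2, V)\big\} \cap E_{1,i}^c \cap E_{0,i}^c \,\Big|\, \mathbf{y}_2(i)\Big) \Pr(\mathbf{y}_2(i)) \\
&\stackrel{(a)}{\le} \sum_{\mathbf{y}_2} \sum_{\substack{w \in \mathcal{L}_2(i) \\ w \neq w_{0,i}}} \Pr\Big(\big\{(\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_{v,i})) \in A_\epsilon^{*(n)}(X, Y_2, V)\big\} \cap E_{1,i}^c \cap E_{0,i}^c \,\Big|\, \mathbf{y}_2(i)\Big) \Pr(\mathbf{y}_2(i)) \\
&\le E_{\mathbf{y}_2}\Bigg\{ \sum_{\substack{w \in \mathcal{L}_2(i) \\ w \neq w_{0,i}}} \ \sum_{\mathbf{v} \in A_\epsilon^{*(n)}(V | \mathbf{y}_2(i), \mathbf{x}(w))} \Pr(\mathbf{v} | \mathbf{y}_2(i), \mathbf{x}(w)) \Bigg\} \\
&\stackrel{(b)}{=} E_{\mathbf{y}_2}\Bigg\{ \sum_{\substack{w \in \mathcal{L}_2(i) \\ w \neq w_{0,i}}} \ \sum_{\mathbf{v} \in A_\epsilon^{*(n)}(V | \mathbf{y}_2(i), \mathbf{x}(w))} \Pr(\mathbf{v} | \mathbf{y}_2(i)) \Bigg\} \\
&\le E_{\mathbf{y}_2}\Bigg\{ \sum_{w \in \mathcal{L}_2(i)} \big\| A_\epsilon^{*(n)}(V | \mathbf{y}_2(i), \mathbf{x}(w)) \big\| \times \max_{(\mathbf{y}_2(i), \mathbf{v}) \in A_\epsilon^{*(n)}(Y_2, V)} \Pr(\mathbf{v} | \mathbf{y}_2(i)) \Bigg\} \\
&\stackrel{(c)}{\le} E_{\mathbf{y}_2}\Bigg\{ \sum_{w \in \mathcal{L}_2(i)} 2^{n(H(V|Y_2, X) + 2\eta)}\, 2^{-n(H(V|Y_2) - 2\eta)} \Bigg\} \\
&\le E_{\mathbf{y}_2}\big\{ \|\mathcal{L}_2(i)\| \big\}\, 2^{-n(H(V|Y_2) - H(V|Y_2, X) - 4\eta)},
\end{aligned}
$$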

where (a) is because $\mathcal{L}_2(i)$ is a deterministic function of $\mathbf{y}_2(i)$ 
and we also applied the union bound, and (b) is because $\mathbf{v}(z_{v,i})$ 
is independent of $\mathbf{x}(w)$ for $w \neq w_{0,i}$. The bounds in (c) on 
the size of the conditionally typical set and the maximum 
conditional probability follow from [47, theorem 5.2] with 
$\eta \to 0$ as $\epsilon \to 0$, assuming that $n$ is large enough. Lastly 
we note that here 

$$\Pr(\mathbf{y}_2(i)) = \Pr\big(\mathbf{y}_2(i) \text{ received} \mid \mathbf{x}(w_{0,i}) \text{ transmitted}\big).$$

Next, applying the same technique to bound the expectation of 
$\|\mathcal{L}_2(i)\|$ as in [34, theorem 1] (see also the derivation of equation 
(A.1)), we get that for $n$ large enough, 

$$E_{\mathbf{y}_2}\big\{\|\mathcal{L}_2(i)\|\big\} \le 1 + 2^{n(R - I(X; Y_2) + 3\eta)}. \tag{B.1}$$



Plugging this back into the bound on $\Pr(E''_{21,i})$ we get that 

$$
\begin{aligned}
\Pr(E''_{21,i}) &\le 2^{-n(I(X; V | Y_2) - 4\eta)} \\
&\quad + 2^{n(R - I(X; Y_2) - H(V|Y_2) + H(V|Y_2, X) + 7\eta)},
\end{aligned}
\tag{B.2}
$$

which can be made less than any arbitrary $\epsilon > 0$ by taking $n$ 
large enough, as long as$^{4}$ 

$$R < I(X; Y_2) - H(V|Y_2, X) + H(V|Y_2). \tag{B.3}$$



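For bounding $\Pr(E''_{22,i})$ we begin essentially in the same 
manner and get that 

$$
\begin{aligned}
\Pr(E''_{22,i})
&\stackrel{(a)}{\le} E_{\mathbf{y}_2, \mathbf{v}}\Big\{ \Pr\Big(\exists w \neq w_{0,i},\ w \in \mathcal{L}_2(i),\ \exists z_v \in S'_{s'_{i+1}},\ z_v \neq z_{v,i},\ (\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_v)) \in A_\epsilon^{*(n)}(X, Y_2, V) \,\Big|\, \mathbf{y}_2(i), \mathbf{v}(z_{v,i})\Big) \Big\} \\
&\stackrel{(b)}{\le} E_{\mathbf{y}_2, \mathbf{v}}\Bigg\{ \sum_{\substack{z_v \in S'_{s'_{i+1}} \\ z_v \neq z_{v,i}}} \ \sum_{w \in \mathcal{L}_2(i)} \Pr\Big((\mathbf{x}(w), \mathbf{y}_2(i), \mathbf{v}(z_v)) \in A_\epsilon^{*(n)}(X, Y_2, V) \,\Big|\, \mathbf{y}_2(i), \mathbf{v}(z_{v,i})\Big) \Bigg\} \\
&\stackrel{(c)}{=} E_{\mathbf{y}_2, \mathbf{v}}\Bigg\{ \sum_{\substack{z_v \in S'_{s'_{i+1}} \\ z_v \neq z_{v,i}}} \ \sum_{w \in \mathcal{L}_2(i)} \ \sum_{\mathbf{v} \in A_\epsilon^{*(n)}(V | \mathbf{y}_2(i), \mathbf{x}(w))} \Pr(\mathbf{v}) \Bigg\} \\
&\stackrel{(d)}{\le} E_{\mathbf{v}}\big\{\|S'_{s'_{i+1}}\|\big\}\, E_{\mathbf{y}_2}\big\{\|\mathcal{L}_2(i)\|\big\}\, 2^{-n(H(V) - H(V|Y_2, X) - 3\eta)} \\
&\stackrel{(e)}{\le} \Big(1 + 2^{n(R'_1 - C_{12})}\Big) \Big(1 + 2^{n(R - I(X; Y_2) + 3\eta)}\Big) \times 2^{-n(H(V) - H(V|Y_2, X) - 3\eta)} \\
&= 2^{-n(C_{12} + H(V) - R'_1 - H(V|Y_2, X) - 3\eta)} \\
&\quad + 2^{n(R - I(X; Y_2) - I(V; Y_2, X) + 6\eta)} + 2^{-n(I(V; Y_2, X) - 3\eta)} \\
&\quad + 2^{n(R - I(X; Y_2) - C_{12} + R'_1 - H(V) + H(V|Y_2, X) + 6\eta)},
\end{aligned}
$$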

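where (a) is because we dropped the intersection with 
$E_{1,i}^c \cap E_{0,i}^c$, (b) is due to the union bound, (c) is because $\mathbf{v}(z_v)$ 
is independent of $\mathbf{x}(w)$ and $\mathbf{y}_2(i)$ when $z_v \neq z_{v,i}$, and (d) is 
because 

$$
\begin{aligned}
E_{\mathbf{y}_2, \mathbf{v}}\big\{\|\mathcal{L}_2(i)\| \cdot \|S'_{s'_{i+1}}\|\big\}
&= E_{\mathbf{y}_2}\big\{ E_{\mathbf{v}|\mathbf{y}_2}\big\{\|\mathcal{L}_2(i)\| \cdot \|S'_{s'_{i+1}}\|\big\} \big\} \\
&\stackrel{(f)}{=} E_{\mathbf{y}_2}\big\{ \|\mathcal{L}_2(i)\|\, E_{\mathbf{v}|\mathbf{y}_2}\big\{\|S'_{s'_{i+1}}\|\big\} \big\} \\
&\stackrel{(g)}{=} E_{\mathbf{y}_2}\big\{ \|\mathcal{L}_2(i)\|\, E_{\mathbf{v}}\big\{\|S'_{s'_{i+1}}\|\big\} \big\} \\
&= E_{\mathbf{y}_2}\big\{\|\mathcal{L}_2(i)\|\big\}\, E_{\mathbf{v}}\big\{\|S'_{s'_{i+1}}\|\big\},
\end{aligned}
$$

where (f) is because the average size of $\mathcal{L}_2(i)$ does not depend 
on $\mathbf{v}(z_{v,i})$ when $\mathbf{y}_2(i)$ is given, and (g) is because the average 
size of $S'_{s'_{i+1}}$ does not depend on $\mathbf{y}_2(i)$. The bounds on $\Pr(\mathbf{v})$ 
and $\|A_\epsilon^{*(n)}(V | \mathbf{y}_2, \mathbf{x})\|$ in (d) follow from [47, Ch. 5]. The 
bound on $E_{\mathbf{y}_2}\{\|\mathcal{L}_2(i)\|\}$ in (e) follows from equation (B.1). 
We note that here 

$$\Pr(\mathbf{y}_2(i), \mathbf{v}(z_{v,i})) = \Pr\big(\mathbf{y}_2(i), \mathbf{v}(z_{v,i}) \text{ received} \mid \mathbf{x}(w_{0,i}) \text{ transmitted}\big).$$

$^{4}$ We assume that $I(X; V | Y_2) > 0$; otherwise the relay message does not 
help decoding the source message at $R_{x2}$. 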

We conclude that $\Pr(E''_{22,i})$ can be made smaller than any 
$\epsilon > 0$ by taking $n$ large enough, as long as 

$$
\begin{aligned}
R &< I(X; Y_2) - H(V|Y_2, X) + C_{12} - R'_1 + H(V) && \text{(B.4)} \\
R'_1 &< C_{12} - H(V|Y_2, X) + H(V) && \text{(B.5)} \\
R &< I(X; Y_2) + I(V; Y_2, X) && \text{(B.6)} \\
R'_1 &> I(V; Y_1), && \text{(B.7)}
\end{aligned}
$$

where (B.7) follows from appendix B-F.2. 

Now note that making $\Pr(E''_{2,i} \cap E_{1,i}^c \cap E_{0,i}^c)$ arbitrarily 
small requires making both $\Pr(E''_{21,i})$ and $\Pr(E''_{22,i})$ arbitrarily 
small. Thus we also need to satisfy (B.3). Combining 
with (B.6) we see that (B.3) guarantees (B.6) and we are left 
with (B.3), (B.4), (B.5) and (B.7). 

The maximum rate is achieved for the minimal $R'_1$; therefore 
we plug $R'_1 = I(V; Y_1)$ in (B.4) and, combining with (B.3), we 
obtain the following achievable rate: 

$$R < I(X; Y_2) - H(V|Y_2, X) + \min\big(C_{12} + H(V|Y_1),\ H(V|Y_2)\big). \tag{B.8}$$



From the combination of (B.5) and (B.7), we conclude that 
this is achievable as long as 

$$
\begin{aligned}
C_{12} &> I(V; Y_1) + H(V|Y_2, X) - H(V) \\
&= H(V|Y_2, X) - H(V|Y_1) \\
&= I(V; Y_1 | X, Y_2).
\end{aligned}
\tag{B.9}
$$



Equations (B.8) and (B.9) give the conditions for the message 
$W$ to be decoded at $R_{x2}$ with an arbitrarily small probability 
of error by taking $n$ large enough. Note that the requirement in 
(B.9) implies that when $C_{12} < I(V; Y_1 | Y_2, X)$, $R_{x1}$ cannot 
use this cooperation scheme, and the rate to $R_{x2}$ is simply 
$I(X; Y_2)$. Combining this with equation (B.8) yields the rate 
expression in (48) and (49). 
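As a numerical illustration of (B.8) and (B.9), the sketch below evaluates the rate for a toy channel. Everything about the example is an assumption made for illustration: a uniform binary input, two binary symmetric channels to $Y_1$ and $Y_2$ that are independent given $X$, a binary symmetric test channel $p(v|y_1)$, and the helper names `bsc`, `joint_pmf`, `entropy`, `cond_entropy` and `achievable_rate`. None of these choices come from the paper.

```python
from itertools import product
from math import log2

def bsc(p):
    """Transition probability of a binary symmetric channel with crossover p."""
    return lambda a, b: 1 - p if a == b else p

def joint_pmf(p1, p2, q):
    """Joint pmf of (x, y1, y2, v) for p(x) p(y1|x) p(y2|x) p(v|y1), x uniform."""
    ch1, ch2, test = bsc(p1), bsc(p2), bsc(q)
    return {(x, y1, y2, v): 0.5 * ch1(x, y1) * ch2(x, y2) * test(y1, v)
            for x, y1, y2, v in product((0, 1), repeat=4)}

def entropy(pmf, keep):
    """Joint entropy, in bits, of the coordinates listed in `keep`."""
    marg = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[k] for k in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def cond_entropy(pmf, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(pmf, target + given) - entropy(pmf, given)

X, Y1, Y2, V = 0, 1, 2, 3  # coordinate positions in each outcome tuple

def achievable_rate(pmf, c12):
    """Rate from (B.8) when (B.9) holds, and I(X;Y2) otherwise."""
    i_x_y2 = entropy(pmf, [X]) - cond_entropy(pmf, [X], [Y2])
    h_v_y2x = cond_entropy(pmf, [V], [Y2, X])
    h_v_y1 = cond_entropy(pmf, [V], [Y1])
    h_v_y2 = cond_entropy(pmf, [V], [Y2])
    threshold = h_v_y2x - h_v_y1  # equals I(V;Y1|X,Y2), see (B.9)
    if c12 <= threshold:
        return i_x_y2
    return i_x_y2 - h_v_y2x + min(c12 + h_v_y1, h_v_y2)

pmf = joint_pmf(p1=0.2, p2=0.1, q=0.05)
rate_no_cooperation = achievable_rate(pmf, 0.0)
rate_large_link = achievable_rate(pmf, 10.0)
```

In this sketch a link that is too weak to satisfy (B.9) leaves the rate at $I(X; Y_2)$, while a sufficiently large $C_{12}$ saturates the minimum in (B.8) at $H(V|Y_2)$, which gives $I(X; Y_2) + I(X; V | Y_2) = I(X; Y_2, V)$.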